Electronic  Imaging  Standards  for 
Archiving  Records 

Volume  I 
May  31, 1997 



This document was prepared by a contract team from Logicon
Communications Technology Group as the product of the Deputy Assistant
Secretary of Defense, Information Management. The project was aided by
two workshops (October 30-31, 1996 and March 4-6, 1997) facilitated by
the Operational Process Improvement Office of the Defense Information
Systems Agency. The project was accomplished at the Logicon offices,
Sequoia Plaza, Arlington, Virginia. Logicon is performing this task under
DISA/JIEO/D613 contract number DAAB07-91-D-B519 and TO 1820-04-007.


CLEARANCE REQUEST FOR PUBLIC RELEASE OF DEPARTMENT OF DEFENSE INFORMATION
(See Instructions on back.)

(This form is to be used in requesting review and clearance of DoD information
proposed for public release in accordance with DoDD 5230.9.)

TO: Assistant Secretary of Defense (Public Affairs)
ATTN: Director, Freedom of Information & Security Review, Rm. 2C767, Pentagon

1. DOCUMENT DESCRIPTION
   TITLE: Electronic Imaging Standards for Archiving Records
   c. PAGE COUNT: 46
   d. SUBJECT AREA: Graphics Information Standards

2. AUTHOR/SPEAKER
   a. NAME (Last, First, Middle Initial): Newlin, D. Burton Jr.
   TITLE: Computer Specialist
   OFFICE: OASD (C3I) ODASD (C3) IT
   AGENCY: Office of the Secretary of Defense

3. PRESENTATION/PUBLICATION DATA (Date, Place, Event)
   This is a technical report we would like to distribute to the public and make
   available on the Internet on our C3I Records Management homepage.

4. POINT OF CONTACT
   a. NAME (Last, First, Middle Initial): Newlin, D. Burton Jr.
   b. TELEPHONE NO. (Include Area Code): (703) 604-1591

5. PRIOR COORDINATION
   a. NAME (Last, First, Middle Initial):
   b. OFFICE/AGENCY:

6. REMARKS
   This report was developed under contract support by Logicon with government,
   industry and academia participation. Mr. Newlin was the COTR for this contract
   task. DoD plans to present this report to the National Archives and Records
   Administration once it is approved for release by OASD Public Affairs.

CLEARED FOR OPEN PUBLICATION
970617
DIRECTORATE FOR FREEDOM OF INFORMATION AND SECURITY REVIEW (OASD-PA)
DEPARTMENT OF DEFENSE
97-S-2263

7. RECOMMENDATION OF SUBMITTING OFFICE/AGENCY
   a. THE ATTACHED MATERIAL HAS DEPARTMENT/OFFICE/AGENCY APPROVAL FOR PUBLIC
      RELEASE AND CLEARANCE FOR OPEN PUBLICATION IS RECOMMENDED UNDER PROVISIONS
      OF DODD 5230.9. I AM AUTHORIZED TO MAKE THIS RECOMMENDATION FOR RELEASE ON
      BEHALF OF:
   b. CLEARANCE IS REQUESTED BY:
   c. NAME: Meyer, Terri
   OFFICE: OASD(C3I)
   f. AGENCY:
   h. DATE SIGNED:

Form 1910, AUG 90 (EG)
Designed using Perform Pro, WHS/DIOR, Oct 94

Form  Approved 
OMB No. 0704-0188


REPORT  DOCUMENTATION  PAGE 


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources,
gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this
collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate Information Operations and Reports, 1215 Jefferson
Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.

1. AGENCY USE ONLY (Leave blank)  2. REPORT DATE  3. REPORT TYPE AND DATES COVERED

May  31,  1997  Final  Report 

4.  TITLE  AND  SUBTITLE 

Electronic  Imaging  Standards  for  Archiving  Records 

5.  FUNDING  NUMBERS 

C  -  DAAB07-91-D-B519 

TA  -  TO  1820-04-007 

DISA/JIEO/D613 

6.  AUTHOR(S) 

Logicon  Communications  Technology  Group 

for  Deputy  Assistant  Secretary  of  Defense,  Information  Management 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Logicon  Communications  Technology  Group 

Sequoia  Plaza 

Arlington,  VA 

8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 

N/A 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

OASD(C3I)ODASD(C3)IT 

6000  Defense  Pentagon 

Washington,  D.  C.  20301-6000 

10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 

97-S-2263 

11.  SUPPLEMENTARY  NOTES 

12a.  DISTRIBUTION  AVAILABILITY  STATEMENT 

Available  in  hard  copy. 

12b.  DISTRIBUTION  CODE 

General  Distribution 

NTIS 

13.  ABSTRACT  (Maximum  200  words) 

Purpose: Requirements analysis for electronic records recording formats that will lead to the selection of alternative
standards  for  the  storage  and  retrieval  of  electronic  records  and  the  information  they  contain.  Content:  First  report  to 
provide  information  on  image  standards  agencies  with  which  the  Department  of  Defense  should  participate  in  order  to  assure 
the  needs  of  the  Department  are  addressed  by  emerging  standards.  Selection  criteria  used  in  narrowing  the  standards 
considered  are  included.  Recommendation  made  to  the  National  Archives  of  SGML  as  a  format  in  which  to  preserve  images 
of  historical  value.  Limitations  of  SGML.  Further  study  required. 

14.  SUBJECT  TERMS 

Electronic  Imagery  Records  Usage,  Storage,  National  Archives  and  Records  Administration, 
Imagery  Standards 

15.  NUMBER  OF  PAGES 

46 

16.  PRICE  CODE 

17.  SECURITY  CLASSIFICATION  18.  SECURITY  CLASSIFICATION 
OF  REPORT  OF  THIS  PAGE 

Unclassified  Unclassified 

19.  SECURITY  CLASSIFICATION 
OF  ABSTRACT 

Unclassified 

20.  LIMITATION  OF  ABSTRACT 

UL 

Standard  Form  298  (Rev.  2-89)  (EG) 
Prescribed by ANSI Std. Z39.18

Designed  using  Perform  Pro,  WHS/DIOR,  Oct  94 


Electronic Imaging Standards for Archiving Records

Table of Contents
Volume I

Preface

Executive Summary

1.0 Introduction
    Purpose
    Workshop

2.0 Background
    Historical Development
    Digital Challenge

3.0 National Archives And Records Administration
    DoD Relationship
    Future Access
    NARA’s Challenges

4.0 DoD And Electronic Imagery Records Usage
    Core Missions
    DoD Concerns
    Why Standards Evolved
    How do Standards Map to Where We Are Going
    Standards Bodies and Associated Products
    DoD Guidance on Standards
    Commercial-Off-The-Shelf (COTS) and Non-Proprietary
    Focus Generally Operational
    Increased Need for Standards to Correct Problems

5.0 DoD And Electronic Imagery Records Storage
    Originals and Copies
    Varieties of Standards
    Recent Accessions
    Operational Requirements
    How non-NARA Archival Agencies Use Standards
    Digital Libraries
    Costs

6.0 Standards Issues Discussed At Workshop
    Achieving the Objectives
    External Participants
    General Principles
    Standards
    Access
    Migration Issues
    Costs of Using a Particular Standard
    Standards - Raster
    Standards - Vector
    Standards - Other
    Standards - Future
    Classes of Imagery Standards
    Considerations for the Future

7.0 Recommendations
    Image Standards That NARA and DoD Should Adopt
    Imagery Standards Bodies And DoD Participation
    New Directions That Benefit DoD and NARA

8.0 Conclusion

Special Attachments:
    Acronyms
    Workshop Participants
    Endnotes



List of Figures

Figure 1  Archives - Some Significant Dates
Figure 2  Timeline for Development of Selected Imaging Standards and Products
Figure 3  Movement of Records from DoD to NARA
Figure 4  Storage Space Requirements
Figure 5  Core Organizational Issues for NARA and DoD
Figure 6  Amount of Paper in Declassification Project
Figure 7  Comparison of Imaging Standards Cost
Figure 8  Proposed Flow of Information to NARA


List of Tables

Table 1  Formal and Informal Standards
Table 2  Characteristics of Selected Image Formats
Table 3  Document Types and Related Standards
Table 4  Standards Relationships Addressed at Workshop
Table 5  SGML Characteristics
Table 6  Standards Bodies and DoD Representatives
Table 7  Standards Bodies and Represented Standards



Table of Contents
Volume II, Appendices

Appendix A  Standards and Standards Organizations
Appendix B  Workshop Products and Information
Appendix C  Glossary of Terms
Appendix D  Sources
Appendix E  Related Articles
Appendix F  NARA Accessions from DoD, FY1993-FY1996
Appendix G  Federal Laws
Appendix H  DoD Representatives to Standards Bodies



Preface 


As  Executive  agent  for  the  Department  of  Defense  (DoD)  the  Office  of  the 
Assistant  Secretary  of  Defense  (Command,  Control,  Communications  and  Intelligence) 
requires  continued  technical  assistance  in  support  of  Functional  Process  Improvements 
within  the  Department.  It  tasked  Logicon  to  conduct  a  requirements  analysis  for 
electronic  records  recording  standards  that  will  lead  to  selection  of  alternative  standards 
for  the  storage  and  retrieval  of  electronic  records.  This  is  in  support  of  44  United  States 
Code  (USC)  and  36  Code  of  Federal  Regulations  (CFR)  requirements  for  storage  of 
electronic  records  with  the  National  Archives  and  Records  Administration  (NARA). 

This report includes the sections listed below. Section 1.0, “Introduction,”
provides  a  synopsis  of  the  purpose  and  authority  for  this  report.  Section  2.0, 
“Background,”  arranges  the  history  and  current  status  of  archiving  electronic  records. 
Section  3.0,  “National  Archives  and  Records  Administration,”  describes  key  issues 
relating  to  the  organization  receiving  archived  records.  Section  4.0  “DoD  and  Electronic 
Imagery  Records  Usage,”  depicts  the  evolution  of  imagery  records  usage  and  the  needs 
for  standards.  Section  5.0  “DoD  and  Electronics  Records  Storage,”  explains  how  DoD 
and  other  entities  store  electronic  records,  emphasizing  images.  Section  6.0  “Specific 
Standards  Issues  Discussed  at  Workshop”  analyzes  requirements  for  imaging  standards 
and  methods  discussed  by  subject  matter  experts  at  two  groupware  workshops.  Section 
7.0  “Imagery  Standards  Bodies  and  DoD  Participation,”  presents  the  standards  bodies  or 
similar  organizations  DoD  should  work  with  in  setting  up  de  jure  or  de  facto  standards. 
Section  8.0  “Imagery  Standards  Issues,”  discusses  direction  for  DoD  and  difficulties  with 
imagery  standards  in  archiving.  Section  9.0  “Recommended  Imagery  Standards,” 
presents  the  results  of  the  analysis,  listing  the  recommended  formats,  suggesting 
directions  for  the  future  of  digital  imagery  archiving.  Section  10.0  “Conclusion,”  offers 
the  conclusions  of  the  study  and  suggests  future  directions  and  research  needs.  Volume  II 
contains  appendices  with  corroborating  or  background  material. 


Executive  Summary 


The  purpose  of  this  report  was  to  conduct  a  requirements  analysis  for  electronic  records 
recording  formats  that  will  lead  to  the  selection  of  alternative  standards  for  the  storage 
and  retrieval  of  electronic  records  and  the  information  they  contain.  Using  two 
workshops  as  the  basis  for  the  source  material,  the  study  reflects  the  considerable 
progress  made  in  that  direction.  Criteria  were  applied,  some  solutions  found,  and 
directions  to  follow  to  resolve  the  remainder  established. 

Specifically, the report identifies the image standards agencies with which the
Department of Defense should participate. Additionally, DoD is advised to pursue
relationships with commercial producers. As more products are purchased
commercial-off-the-shelf, the Department should, as a customer, work with its providers.
This means contacting organizations such as Adobe to explain how DoD might benefit from
improvements to PDF, or working with the consortium developing FlashPix to share needs
before the specifications are complete.

The selection criteria established by the group narrowed the set of standards under
consideration to fewer than a dozen. None of these were immediately eliminated from
consideration, because they need further research. One, SGML, should be accepted by
the National Archives and the Department. The caveat is that DoD organizations now
using  SGML  should  be  allowed  to  preserve  and  deliver  to  NARA  their  data  in  this 
format.  The  high  cost  of  taking  documents  into  this  format  makes  it  ineffective  as  a 
device  for  storing  all  archives.  For  example,  storing  the  one  billion  pages  subject  to 
imaging  and  redaction  in  an  ongoing  declassification  effort  could  cost  as  much  as 
$4,750,000,000,  an  impossible  figure  to  justify  in  the  budget. 
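
The arithmetic behind this estimate can be made explicit. The short sketch below is illustrative only: it derives the average per-page cost implied by the two numbers quoted above (one billion pages and an upper estimate of $4,750,000,000). It is not a costing model from the workshops, and the constant and function names are hypothetical.

```python
from fractions import Fraction

# Figures quoted in the paragraph above; the dollar amount is an upper estimate
# ("could cost as much as").
PAGES_IN_DECLASSIFICATION_EFFORT = 1_000_000_000
ESTIMATED_SGML_COST_DOLLARS = 4_750_000_000


def implied_cost_per_page(total_cost_dollars, pages):
    """Return the average cost per page implied by a total cost and a page count."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return Fraction(total_cost_dollars, pages)


per_page = implied_cost_per_page(
    ESTIMATED_SGML_COST_DOLLARS, PAGES_IN_DECLASSIFICATION_EFFORT
)
print(f"Implied SGML conversion cost: ${float(per_page):.2f} per page")
```

Under these figures the implied average is $4.75 per page, which is why the format is workable for material already authored in SGML but not as a blanket conversion target.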

To  complete  this  study,  the  department  needs  to 

•  Determine  the  life  cycle  costs  of  using  each  potential  standard  and  plan  methods 
for  the  inclusion  of  this  figure  in  the  DoD  budget. 

•  Survey  organizations  to  determine  volume  and  formats  of  images 

•  Assemble  a  team  of  experts  to  include  new  profiles,  standards,  conformance 
testing  and  certification  efforts,  test  suite  generation  and  promulgation  efforts,  and  joint 
industry  and  government  initiatives. 

•  Complete  the  evaluation  of  existing  standards  using  the  workshop  criteria 

•  Update  and  revise  affected  DoD  publications  as  required 

•  As  technology  advances,  review  the  imagery  policies  and  standards  and  update 
on  a  periodic  basis 

•  Evaluate  the  need  for  a  DoD  digital  library  allowing  access  to  and  less  costly 
preservation  of  data  prior  to  archiving  with  NARA 

Taking  this  project  to  its  next  logical  steps  will  provide  DoD  with  a  way  to  follow  the 
instructions  of  archiving  laws  while  offering  greater  access  to  digital  information. 


1.0  INTRODUCTION 


Purpose. The purpose of this report is to conduct a requirements analysis for electronic
records recording formats that will lead to the selection of alternative standards for the
storage and retrieval of electronic records and the information they contain. The
requirements analysis included a survey of formats being used across the Department of
Defense (DoD), Library of Congress, National Archives and Records Administration
(NARA) and other entities. Using a groupware workshop technique involving a diverse
group of subject matter experts, the team reviewed the requirements of NARA, and found
that methods already existed to store text materials. No method existed to store electronic
images, and images make up, by volume, the majority of DoD accessions to NARA. As
a result, guidance from the sponsoring agency focused the research and analysis on
imagery standards DoD should recommend to NARA for acceptance in archiving DoD
image records.

This report documents and recommends a set of imaging standards to consider in
archiving electronic imagery records. In this information age, more records are being
made available as images. The major types of images fall into four categories:
business, technical, personnel, and medical. These include spreadsheets, drawings,
forms, pictures, weather, satellite images, models and simulation, video, television,
multimedia, fingerprints, x-rays, MRI, CAT scans, EKGs and images of pages of text.
The DoD has the majority of the records at the National Archives and Records
Administration (NARA), but NARA has no standards for imagery information.
Currently NARA accepts electronic records only in the American Standard Code for
Information Interchange (ASCII) and Extended Binary Coded Decimal Interchange Code
(EBCDIC) formats and on magnetic tapes or CD-ROM. These two standards only
address the formats for electronic text materials. The objective of this analysis was to
improve information systems by identifying and recommending an alternative set of
storage and retrieval standards for electronic imagery records information.

Workshops.  Two  groupware  workshop  sessions  were  held  with  representatives  from 
DoD,  Federal  Government,  Industry  and  Academe.  (See  Appendix  B)  These  workshops 
identified  and  documented  DoD’s  major  imagery  records  requirements  and  proposed  a 
minimum  set  of  imagery  standards  supported  by  the  market  place  that  will  be  acceptable 
to both DoD and NARA for long term imaging archiving as the imaging technology
evolves.  This  report  summarizes  the  findings  and  recommendations  of  the  participants 
and  experts  that  attended  these  workshops.  The  findings  identify  for  NARA  the  specific 
image  standards  requirements  that  will  increase  efficiency,  effectiveness  and  usefulness 
of  archival  information.  They  also  identify  the  commercial  standards  that  could  satisfy 
requirements  and  they  identify  technical  imagery  standards  bodies  that  the  government 
should  participate  with  in  developing  the  requirements  needed  by  the  Department  of 
Defense  and  NARA. 




2.0  BACKGROUND 


From  ancient  times  people  have  maintained  an  image  and  textual  record  of  their 
condition  and  their  progress.  Over  the  past  two  thousand  years  little  varied  in  the  reasons 
for  preserving  information  or  the  types  of  records  stored;  theological,  government, 
personal,  business,  scientific  and  academic  research  all  needed  safe  and  accessible 
storage.  Old  issues  of  selection  (accession),  medium  of  storage,  security,  authenticity  of 
the  information,  ownership,  accessibility,  standardization,  loss  of  data  or  information,  and 
cost,  are  still  issues.  Some  conventions  and  methods  of  storing  this  information  changed 
over  the  years  --  to  suit  the  occasion  or  times;  maps  on  goatskins,  paper,  clay,  brass,  or 
even  silk;  different  methods  of  painting  evolved.  New  technologies  propelled  new 
methods  such  as  film,  movies,  phonograph  records,  sound  and  video  tapes,  television,  and 
computers  to  share  and  store  information.  (See  Figure  1) 


[Figure 1 graphic, titled "Preserving The Human Record - Selected Times": a timeline
with date labels 20,000 BC, 700-800, 1500, 1910-1950, and 1950-1980. Entries shown:
Maps on Paper, Silk, & Brass; Arab Libraries, Medieval Archives; Printing Press;
Pulp Paper/Camera Photos; Improved Printing Presses; Moving Pictures/Phonograph
Records; Radio/Sound Movies/Color Film/Television/Aerial Photos; Computers/Color
Television/Satellite Imagery/Personal Computers.]

Figure 1. Archives - Some Significant Dates

Digital  Challenge.  The  advent  of  the  computer  significantly  altered  the  pace  of  change. 
This  challenges  people  interested  in  creating  standards  and  preserving  records.  (See 
Figure  2) 




[Figure 2 graphic: a timeline with axis labels 1970, 1980, 1996, and 2000; projected
items are marked with an "X". Items shown, by category:

TEXT
    Products: Excalibur Text, Multimate, Microsoft Word, WordStar, WordPerfect
    Standards or Specifications: ASCII, EBCDIC
IMAGE
    Products: Excalibur Image, Adobe Acrobat
    Standards or Specifications: TIFF, JPEG, SGML, GIF, PDF, JBIG
VIDEO/AUDIO
    Standards or Specifications: 8 mm, 16 mm, Beta, VHS, MPEG1, MPEG4
WEB
    Products: Netscape, Microsoft Explorer
    Standards or Specifications: HTML, Java

Note 1: Not all inclusive.]

Figure 2. Timeline for Development of Selected Imaging Standards and Products. For definitions of
acronyms see Chapter 5.


In  the  past,  preserved  items  were  generally  visible  and  storage  methods  altered  slowly. 
For  example,  a  book  properly  accessioned,  cataloged,  and  stored  could  reside  unaltered  in 
the  archives,  even  if  well  used,  for  over  a  century.  Standards  applied,  such  as  use  of 
alphabets,  pagination,  or  placement  of  text,  could  vary  slightly  without  creating  problems. 
A  researcher  may  have  no  difficulty  reading  a  200  year  old  document.  Electronic  records 
pose  more  storage  challenges.  Formats  often  vary  with  great  speed.  As  an  example,  take 
a word-processed text stored in WordStar on a 5 1/4 inch disk in 1987. First, in 1997,
very few sources can still read the WordStar format. Second, the percentage of computers
compatible with 5 1/4 inch disks has decreased. Both the format and the technology
changed markedly in the past ten years. For electronic images, the rate of change was
even faster.


In  a  meeting  sponsored  by  the  Australian  Archives,  scholar  Maggie  Exon  said, 
“Unfortunately,  as  we  know,  digital  materials  do  provide  particular 
problems  for  preservation.  There  is  a  very  real  possibility  that  nothing 
created,  stored,  and  disseminated  electronically  will  survive  in  the  long 
term.  The  problem  does  need  to  be  stated  this  dramatically.  I  have  an 
unfailing  sinking  feeling  whenever  anybody  links  the  concepts  of 
digitization  and  preservation.  I  have  a  profound  and  unchanging  disbelief 
that  these  two  concepts  belong  in  any  sense  in  the  same  world.”* 


The challenge is to preserve and maintain electronic records in a usable form, particularly
in the case of graphic images.




3.0  NATIONAL  ARCHIVES  AND  RECORDS  ADMINISTRATION 


The  National  Archives  and  Records  Administration  (NARA),  an  independent  federal 
agency,  acquires,  preserves,  and  makes  available  for  research  records  of  enduring  value 
created  or  received  by  organizations  of  the  executive,  legislative,  and  judicial  branches  of 
the  Federal  Government.  (For  some  specific  legal  requirements,  see  Appendix  F) 
NARA's  33  facilities  that  store  this  information  house  about  21.5  million  cubic  feet  of 
original  textual  materials  --  about  the  equivalent  of  4  billion  pages  of  text.  The 
multimedia  collections  contain  nearly  300,000  reels  of  motion  picture  film,  more  than  5 
million  maps,  charts  and  architectural  drawings,  more  than  200,000  sound  and  video 
recordings,  more  than  9  million  aerial  photographs,  nearly  14  million  still  pictures  and 
posters, and about 29,000 computer data sets.^ The archives also have over 300,000
rolls of microfilm.^ These records covered the period from the Continental Congress to
1994  and  a  wide  range  of  topics  including  policy,  civilian  and  military  personnel  records. 
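
As a rough check on scale, the two holdings figures above imply a conversion factor between cubic feet of paper and pages. The sketch below simply divides one quoted figure by the other; it is an illustration rather than a NARA planning factor, and the names are hypothetical.

```python
# Figures quoted in the paragraph above (both are approximations in the source).
CUBIC_FEET_OF_TEXTUAL_RECORDS = 21_500_000
EQUIVALENT_PAGES_OF_TEXT = 4_000_000_000


def pages_per_cubic_foot(pages, cubic_feet):
    """Return the average number of pages represented by one cubic foot of records."""
    if cubic_feet <= 0:
        raise ValueError("cubic_feet must be positive")
    return pages / cubic_feet


density = pages_per_cubic_foot(EQUIVALENT_PAGES_OF_TEXT, CUBIC_FEET_OF_TEXTUAL_RECORDS)
print(f"About {round(density)} pages per cubic foot")
```

The result, roughly 186 pages per cubic foot, is only as precise as the rounded inputs, but it gives a sense of how paper volumes translate into page counts when sizing an imaging effort.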

DoD  Relationship.  The  Department  of  Defense  is  the  largest  single  contributor  to  the 
archives  and  deposits  records  in  text,  audio-visual,  electronic  and  other  formats.  The 
largest  single  area  by  sheer  volume  involves  imagery  —  photographs,  films,  engineering 
drawings,  blueprints,  and  other  such  records.  The  volume  of  transfers  varies  by  year  and 
depends  on  the  world  order  —  for  example  the  records  for  World  War  II  greatly  exceed 
the total for the preceding twenty years. With the passage of time and changing
technology the types of items retained have also varied. The importance of these records is
apparent  in  their  use  in  many  documentaries,  in  research,  and  their  continued  use  for 
training  and  in  reviewing  current  policy  and  plans.  An  example  of  the  widespread  need 
for  these  files  is  the  1,500,000  requests  annually  for  information  on  veterans  records. 

To  get  an  idea  of  the  current  requirements  this  project  surveyed  the  accessions  to  NARA 
from  DoD  from  fiscal  year  1993  through  fiscal  year  1996.  DoD  transferred  over  36,000 
cubic  feet  of  textual  records,  1500  cubic  feet  of  audio-visual  materials,  60,000  cubic  feet 
of  aerial  photographs,  4,000  cubic  feet  of  other  photographic  images  and  drawings  in 
traditional  media.  The  electronic  records  collection  included  7,800,000  records  from  the 
Army  Surgeon  General,  Southeast  Asia  Combat  Area  Casualty  Files,  contract  data,  and 
several  database  records.  (See  Appendix  E) 
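
The survey figures above support the claim that imagery dominates DoD accessions by volume. The sketch below totals the quoted cubic-foot figures and computes the photographic share. It is illustrative only: the textual figure is a lower bound ("over 36,000"), the grouping of categories into "imagery" is an assumption made here, and audio-visual materials are deliberately left out of the imagery total even though some of them are films.

```python
# Cubic feet transferred FY1993-FY1996, as quoted in the paragraph above.
ACCESSIONS_CUBIC_FEET = {
    "textual records": 36_000,  # source says "over 36,000"
    "audio-visual materials": 1_500,
    "aerial photographs": 60_000,
    "other photographic images and drawings": 4_000,
}

# Assumption for this sketch: only the two photographic categories count as imagery.
IMAGERY_CATEGORIES = {"aerial photographs", "other photographic images and drawings"}


def imagery_share(volumes, imagery_keys):
    """Return the fraction of total volume contributed by the imagery categories."""
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    imagery = sum(v for name, v in volumes.items() if name in imagery_keys)
    return imagery / total


share = imagery_share(ACCESSIONS_CUBIC_FEET, IMAGERY_CATEGORIES)
print(f"Imagery share of traditional-media accessions: {share:.0%}")
```

With these inputs, photographs and drawings account for about 63 percent of the roughly 101,500 cubic feet transferred in traditional media.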

For  the  Department  of  Defense,  the  process  of  transferring  records  to  the  Archives  is 
fairly  clear.  Records  managers  at  DoD  and  archivists  at  NARA  agree  on  a  schedule  for 
the  records.  Under  these  schedules,  and  the  General  Records  Schedules  that  apply  to  all 
Federal  Agencies,  DoD  and  NARA  establish  retention  periods  for  all  DoD  records.  At 
prescribed  time  periods  managers  destroy  some  records  on-site,  transfer  others  to  the 
appropriate  Federal  Records  Center  (FRC)  for  long-term  storage,  and  transfer  a  small 
portion  to  the  National  Archives  for  permanent  retention  either  directly  from  DoD 
facilities  or  after  a  period  of  time  in  an  FRC.  Because  of  preservation  concerns,  NARA 
accessions  electronic  records  directly  from  DoD  facilities.  (See  Figure  3) 




[Figure 3 graphic: flow of records from DoD to the Federal Records Centers and to NARA.
Annotations: FRCs receive less than 10% of all documents. NARA accessions about 3% of
all DoD records.]

Figure 3. Movement of Records from DoD to NARA.


Future Access. To ensure future access to electronic records, NARA has required that
all such records must conform with national and international encoding and recording
standards. Currently NARA requires that all electronic records adhere to either the
American Standard Code for Information Interchange (ASCII) or Extended Binary Coded
Decimal Interchange Code (EBCDIC) format and be stored on tape or CD-ROM. This
system fails to address images, the majority of non-digital accessions (by volume) from
DoD. An added problem is that digital images take up a great deal more computer
memory space than comparable text. (See Figure 4) Because of this difficulty with
storing images and their relative importance, they became the focus of this study.

[Figure 4 graphic: text and image digital space requirements equivalent to one
8 1/2” x 11” page.]

Figure 4. Storage Space Requirements

NARA’s Challenges. NARA’s task of preserving electronic records is expanding rapidly.
Likewise, the complexity and costs associated with this task lead several sources in
America, including NARA leaders, to question the ability of the National Archives to store
required electronic records at current funding levels. The task confronting them is
daunting: to save all important records, yet to do so only in formats that can easily
migrate from an old format to a new one, probably every five to ten years. Preservation
and storage of electronic records evolved into one of the most expensive portions of the
records life cycle. In the paper world, the costs of storage are usually greatest at
accession: the physical accessioning, the cataloging, and the storage location are all
up-front costs. Once the object is in place the costs are fairly steady and known. John
Carlin, Archivist of the United States, noted in NARA’s Strategic Plan that just the cost
of storing paper records consumes almost half of NARA’s annual budget. This leaves
insufficient money to take care of needed major facility repairs or the additional space
needed. He also noted the problems of the “PROFS case,” in which courts required the
archives to store White House e-mail and electronic records, a requirement that was
unfunded in the budget.^ Looking at the NARA budget, a National Research Council report
discussing the archiving of scientific records even went so far as to say, “NARA’s budget
for the Center for Electronic Records, which has the formal responsibility for archiving
all types of federal electronic records, was only $2.5 million in FY 1994, a budget lower
than that of many of the individual agency data centers.... Given NARA’s current and
projected level of effort for archiving electronic scientific data, it is obvious that
NARA will be unable to take custody of the vast majority of these scientific data sets.”^
Though DoD does not wish to be an archiving agency, it must look at this issue and
consider at least temporary methods to use until NARA can assume the full responsibility
for archiving electronic records. Recognizing the primacy of the National Archives in
deciding archival matters, such decisions should be made in concert with NARA’s Center
for Electronic Records.




4.0  DOD  AND  ELECTRONIC  IMAGERY  RECORDS  USAGE 


Core  Missions.  While  NARA’s  core  mission  is  to  preserve  records  and  provide  access  to 
them,  the  core  mission  of  the  Department  of  Defense  is  warfighting.  (See  Figure  5) 


[Figure 5 graphic, titled "Core Issues - NARA and DoD": side-by-side panels labeled
NARA and DoD.]

Figure 5. Core Organizational Issues for NARA and DoD


Because  of  this  focus  DoD  programmers  and  users  tend  to  view  digital  information  in 
terms  of  enhancing  mission  effectiveness.  This  operational  viewpoint  largely  precluded 
concern  with  long  term  preservation.  The  Department  leadership  readily  accepts  the  need 
for  preservation  and  has  routinely  cooperated  with  NARA  in  this  regard.  In  addition  to 
being  the  law,  it  remains  in  the  department’s  own  interest,  for  the  archived  materials 
provide  continuity  information,  training  materials,  and  research  sources  on  how  to  better 
conduct war. Still, because of organizational priorities, those developing or gathering
digital data rarely focus on preservation, nor should they; operational needs always
come first. Even so, where practical, the department wants the ability to retain
access to data.

DoD Concerns. Several examples of the operational concern exist in DoD. One is the
push by DoD Components to achieve equipment interoperability. At individual facilities,
information managers attempt to minimize “stovepiping,” or the creation of databases that
cannot communicate with one another.

In this setting, one of the notable changes of recent years is the significant increase in
the use of digitized imagery. Operationally, many images are in day-to-day use but
would not be archived. Other items, like the dramatic Joint Surveillance Target Attack
Radar System (J-STARS) images depicting the “Highway of Death” during the Persian
Gulf War, deserve preservation. To understand the full development of the battles and the
war, digital battle orders and directives also require capture.




On the technical side, engineering drawings increasingly rely on the computer
monitor instead of the drafting table. Some computer programs even allow
three-dimensional viewing. Service directives require retention of engineering drawings of
ships for the life of the ship, and this requirement remains for digital drawings, at a time
when it is not unusual for the armed forces to retain ships and aircraft for thirty or even
forty years. Personnel offices digitally store images of fingerprints. Medical records managers
must preserve digital MRIs, CAT scans, or even x-rays of service members. The
National Imagery and Mapping Agency (NIMA, the successor to the Defense Mapping
Agency) increasingly digitizes geographical information systems (GIS) data. Digital
enhancement of photos can produce better images than the original, and these images
often deserve preservation. Retention of this information is valuable to the Services.

Related to this, the Department of Defense must complete an interesting legacy task by the
year 2000: declassifying almost one billion pages of documents. This is the
equivalent of six hundred stacks of paper piled to the height of the Washington
Monument. A rather daunting task, this involves making images of the original pages and
redacting them (blacking out the items still classified). Traditionally this was done with
paper; the Department now plans to scan the pages digitally as images using a TIFF
specification. In this instance the storage now involves the original page and a
digital image. This greatly speeds up the process and saves money.
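
The comparison between one billion pages and six hundred Washington Monuments can be checked with simple arithmetic. The sketch below is only a rough plausibility check: the sheet thickness (0.1 mm) and the monument height (about 169 m, roughly 555 feet) are assumptions supplied here and do not come from the report.

```python
# Rough plausibility check of the "600 Washington Monuments" comparison.
# Both constants are assumptions for illustration, not figures from the report.
SHEET_THICKNESS_MM = 0.1   # assumed thickness of one sheet of office paper
MONUMENT_HEIGHT_M = 169.0  # approximate height of the Washington Monument


def stack_height_m(pages: int, sheet_mm: float = SHEET_THICKNESS_MM) -> float:
    """Height in metres of a single stack containing `pages` sheets."""
    return pages * sheet_mm / 1000.0


def monuments_equivalent(pages: int) -> float:
    """How many monument heights a single stack of `pages` sheets would reach."""
    return stack_height_m(pages) / MONUMENT_HEIGHT_M


pages = 1_000_000_000
print(f"stack height: {stack_height_m(pages) / 1000:.0f} km")
print(f"monument heights: {monuments_equivalent(pages):.0f}")
```

Under these assumptions the stack is about 100 km tall, or a little under 600 monument heights, which is consistent with the figure quoted above. A thicker or thinner paper stock would shift the result proportionally.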

Production  of  these  images  is 
possible  because  of  specific  standards 
implemented  to  allow  their  creation. 
Within  the  Department  of  Defense, 
requirements  exist  to  go  beyond 
international  standards,  to  meet 
specific  DoD  criteria  designed  to  minimize  duplication  and  to  enhance  interoperability. 

Why  Standards  Evolved.  Standards  in  most  disciplines  developed  over  time  in 
response  to  a  need  for  uniformity.  Examples  of  this  are  standards  for  screw  threads  and 
for  electrical  power.  Standards  work  in  the  information  technology  field  has  only 
recently  begun,  with  most  of  the  work  in  the  last  ten  years.  Standards  for  Fax  machines 
and  VCRs  are  examples  of  standards  created  in  response  to  the  development  of  a  new 
technology. 

The proliferation of computers and the rate of change of technology have caused major
problems  for  the  standards  bodies.  Some  estimates  indicate  that  more  than  50  percent  of 
all  new  standards  pages  being  developed  these  days  are  in  the  field  of  information 
technology.  In  this  rapidly  changing  field,  timing  is  critical  to  the  success  of  a  standard. 


Figure 6: Amount of Paper in Declassification Project (a stack of one billion pages of
paper is 600 times as high as the Washington Monument)




Pushing  a  standard  too  early  entails  the  risk  of  entrenching  an  approach  or  technology 
that  does  not  meet  real-world  needs.  Creating  standards  that  do  not  serve  commercial 
interests  slows  down  the  growth  of  the  entire  industry.  On  the  other  hand,  standardizing 
too  late  causes  arbitrary  diversity  and  wasted  investment. 

At the workshop, Steve Carson noted that the key results or benefits of standardization are:

•  Enhanced  product  quality  and  reliability  at  a  reasonable  price 

•  Improved  health,  safety  and  environmental  protection,  and  reduction  of  waste 

•  Greater  compatibility  and  interoperability  of  goods  and  services,  giving  the  purchaser 
more  flexibility  in  equipment  selection  and  use 

•  Simplification  for  improved  usability 

•  Reduction  in  the  number  of  models,  and  thus  reduction  in  costs 

•  Increased  distribution  efficiency,  and  ease  of  maintenance 

•  Increased  assurance  that  there  will  be  a  large  market  for  a  particular  piece  of 
equipment  or  software 

The  principal  disadvantage  of  standards  is  their  tendency  to  freeze  technology.  In  the 
time  it  takes  to  develop,  subject  to  review  and  compromise,  and  promulgate  a  new 
standard,  more  efficient  or  new  technologies  may  appear. 

How  do  Standards  Map  to  Where  We  Are  Going 

The  standards  considered  in  the  group  session,  SGML,  PDF,  TIFF,  HTML  and  others, 
were  all  designed  and  developed  to  interchange  operational  graphics.  While  some  such 
products  are  in  the  developmental  stage,  we  have  not  found  any  standards  or  products 
specifically  designed  for  the  archiving  of  records. 

The  group  discussed  a  great  number  of  different  documents  that  could  possibly  be 
candidates for storage by NARA. These documents include redacted declassified
documents, Defense Intelligence Agency (DIA) images, GIS documents, and medical
records.  To  store  this  vast  array  of  images,  no  single  standard  could  fulfill  all  the 
requirements. 

Standards  Bodies  and  Associated  Products 

The major formal organizations which develop or facilitate the development of standards
for information processing are: the International Organization for Standardization (ISO),
the American National Standards Institute (ANSI), the Institute of Electrical and
Electronics Engineers (IEEE), the International Telecommunication Union (ITU), and the
Internet Society (ISOC), which includes the Internet Architecture Board (IAB) and the
Internet Engineering Task Force (IETF).

Informal standards are developed by single companies (Microsoft’s RTF), an
alliance of a few companies (TIFF, developed by Microsoft, Aldus, and Hewlett-Packard),
or a consortium of many vendors (the Open Software Foundation’s Distributed Computing
Environment). These standards developers have no formal accreditation and release their
products into the public domain.

A  de  facto  (informal)  standard  emerges  from  popular  use  of  a  technology,  but  accredited 
standards  organizations  create  de  jure  (formal)  standards.  A  technology  becomes  a  de 
facto  standard  in  the  industry  by  “popular  vote,”  the  purchase  of  one  product  over  all 
competing  products.  The  “winning”  standard  may  reflect  a  compromise  of  competing 
standards  and  may  not  be  the  most  technologically  superior.  By  achieving  greater  market 
penetration,  VHS  became  the  standard  for  video  cassette  recorders,  even  though  many 
considered  BETA  to  be  the  better  product.  Table  1  shows  some  of  the  formal  and 
informal  standards  related  to  the  archiving  of  electronic  images.  (See  Appendix  A  for  a 
description  of  each  of  the  standards  and  standards  organizations.) 


Table 1. Formal and Informal Standards

Type       Sponsor                 Standard
Formal     ISO                     SGML, JPEG, MPEG, CGM
           ITU                     Group 3 and Group 4 fax
           IAB                     HTML
Informal   Microsoft, Aldus, HP    TIFF
           CompuServe              GIF
           Adobe                   PDF

DoD  Guidance  on  Standards 

DII  -  The  Defense  Information  Infrastructure  (DII)  encompasses  all  DoD  shared 
resources  (hardware  and  software)  that  make  up  systems  and  networks  to  support  DoD’s 
global  mission.  There  is  a  heightened  emphasis  within  DoD  on  the  use  of  commercial 
items,  practices,  and  processes,  resulting  in  extensive  efforts  to  replace  military  and 
federal  specifications  and  standards  with  Non-Government  Standards  (NGS).  DoD’s 
Standardization  Program  Division  has  established  a  policy  encouraging  the  use  of 
standards  from  voluntary  standards  bodies  whenever  practical  and  appropriate.  Adoption 
of  voluntary  standards  eliminates  the  cost  to  the  Government  of  developing  its  own 
standards.  The  end  result  is  a  strategy  for  fielding  systems  with  increased  interoperability, 
reduced  development  time,  increased  operational  capability,  minimized  technical 
obsolescence,  minimal  training  requirements,  and  minimized  life  cycle  costs. 

TAFIM  and  JTA  -  The  Joint  Technical  Architecture  (JTA)  and  Technical  Architecture 
Framework  for  Information  Management  (TAFIM)  are  both  DoD  guidance  documents 
currently  in  effect  within  DoD.  The  TAFIM  provides  general  guidance  and  documents 
the  processes  and  framework  for  defining  the  JTA  (and  other  technical  architectures). 
The TAFIM applies to many DoD mission or domain areas and lists all adopted
information technology standards that promote interoperability, portability, and
scalability. The JTA currently focuses on Command, Control, Communications,
Computers and Intelligence (C4I) requirements as related to interoperability by
identifying the minimum set of standards for service areas (one standard per function
where possible). For the C4I service areas domain, the JTA set of standards supersedes
those listed in the TAFIM.

The  mandated  standard  in  the  JTA  and  TAFIM  for  document  interchange  is  Standard 
Generalized  Markup  Language  (SGML).  This  is  an  ISO  approved  standard  for  the 
production  of  documents  intended  for  long-term  storage  and  electronic  dissemination  for 
viewing  in  multiple  formats.  Other  JTA  mandated  standards  related  to  electronic  data 
interchange include: Computer Graphics Metafile (CGM), an interchange format for
vector  graphics;  JPEG  for  still  picture  interchange;  and  MPEG  for  video  data  interchange. 
(See  Appendix  A  -  Standards  and  Standards  Bodies  for  more  information  about  the 
TAFIM  and  JTA.) 

Commercial-Off-The-Shelf  (COTS)  and  Non-proprietary  Standards  and  Products  - 

Software  standards  to  accommodate  all  the  services  and  interfaces  are  still  emerging. 
Although  international  and  national  standards  groups  have  defined  hundreds  of 
information  processing  standards,  they  continue  the  incomplete  task  of  defining  formal 
standards  for  all  functional  areas  needed  for  real-world  systems. 

To fill this standards gap, organizations will often have to standardize around a vendor’s
COTS product. Some of these products have become de facto standards or specifications.
PDF and Microsoft’s DOS and Windows are good examples of vendor products that have
become de facto standards.

Most organizations desire the ability to purchase products based on international or
national standards. This is not always possible or even desirable, since the process of
becoming a standard often takes years and during that time new de facto standards based
on new technology are in the marketplace. The organization then has to make the
decision whether to go with a de jure standard and old technology or new technology and
be at the mercy of the product vendor.

Focus  Generally  Operational  -  The  JTA  and  the  TAFIM  focus  primarily  on  the 
operational  environment  and  the  warfighter  and  do  not  address  the  disposition  of  records 
after  the  information  is  no  longer  required  for  operational  reasons.  The  following 
quotations  come  from  the  JTA  executive  summary. 

The  Joint  Technical  Architecture  (JTA)  provides  the  "building  codes"  that, 
when  implemented,  permit  this  flow  of  information  in  support  of  the  Warfighter. 
The  JTA  identifies  a  common  set  of  mandatory  information  technology  standards 
and guidelines for use in all new and upgraded C4I acquisitions across DoD.
The JTA standards are for sending and receiving information (information transfer
standards such as Internet Protocol suite), for understanding the information
(information content and format standards such as data elements, or image
interpretation standards) and for processing that information. The JTA also
includes  a  common  human-computer  interface  and  "rules"  for  protecting  the 
information  (i.e.,  information  system  security  standards). 

The  scope  of  this  initial  version  of  the  JTA  is  focused  on  Command,  Control 
and  Intelligence  systems  (to  include  sustaining  base,  combat  support  information 
systems,  and  office  automation  systems)  and  the  Communications  and  Computers 
that  directly  support  them  (C4I),  and  the  interfaces  of  those  systems  with  other 
key  assets  (e.g.,  weapon  systems,  sensors,  models  and  simulations)  to  support 
critical  joint  Warfighter  interoperability.  Future  versions  of  the  JTA  will  extend 
the Version 1.0 scope from C4I Systems to include these other domains.

Increased  Need  for  Standards  to  Archive  Imagery  Records  -  Since  the  coverage  of 
both de jure and de facto standards does not satisfy the needs of the record archiving
community,  it  appears  that  one  solution  is  to  participate  as  a  user  in  some  of  the  standards 
committees.  This  would  probably  be  more  effective  if  done  after  the  archiving 
community  met  and  determined  its  requirements. 




5.0 DOD AND ELECTRONIC RECORDS STORAGE

The  Department  of  Defense  and  the  National  Archives  and  Records  Administration  face  a 
difficult  set  of  decisions  on  how  to  preserve  electronic  records.  Because  NARA 
directives  already  select  ASCII  and  EBCDIC  for  text  data,  the  greatest  need  is  to  select 
acceptable standards for imagery records. No standards were designed expressly for
archiving, so the question now is whether any imagery standards are suitable for archiving.
Recent  DoD  accessions  at  NARA  include  digital  records  from  the  Vietnam  War.  Prior  to 
transfer,  DoD  retained  these  internally  for  a  number  of  years.  Because  these  data  sets 
contain  no  images,  they  could  be  delivered  in  ASCII  or  EBCDIC  format. 

Reviewing  the  materials  stored  at  NARA,  by  bulk  most  are  in  image  format,  either  as  still 
photos,  audio-visual  tapes,  or  technical  drawings.  For  the  moment  this  is  not  a  problem, 
since  the  overwhelming  majority  of  NARA  accessions  are  non-electronic.  Even  the 
current  accessions  of  photos  are  overwhelmingly  film  based,  and  not  digital.  The 
problem  is  the  pace  of  change  to  the  electronic  environment. 

Originals and Copies. Despite the current situation, the potential electronic input is
increasing exponentially. Two types of documents are likely to surface: those originally
in electronic format, and those copied into it. Those originally in electronic format
include items like satellite images, geographic information systems data, digital photos,
operations, personnel and medical images, and now manuals as well. Those
copied into electronic formats include redacted and enhanced images. The documents
originally stored in electronic format fit the traditional approach to archiving, that is,
saving the document in its native format or as close to it as practical. The copied
documents are the result of special requirements and technology advances.

Varieties  of  Standards.  Documents  produced  by  the  Department  of  Defense  use  a  wide 
variety  of  standards.  For  example,  in  the  Air  Force  most  technical  publications,  including 
engineering  drawings,  are  available  in  PDF  format  for  access  from  the  web. 
Administrative  publications  are  available  in  SGML.  Administrative  publications 
generally  start  as  documents  produced  in  MSWord,  then  are  placed  in  SGML  prior  to 
publication.  For  dissemination  on  the  web  they  are  released  in  PDF.  Graphics  in  the 
documents are not embedded; rather, they are linked using GIF, TIFF or JPEG standards.
The  selection  of  one  standard  is  difficult,  since  the  production  often  involves  several 
standards  in  a  single  document. 

Several types of data sets offer enormous potential for research, with appropriate access.
Satellite images using raster formats give researchers advantages over the old
photographic images. For example, at one time cloud cover interrupting terrain in
satellite photos bothered researchers. Now they can take the digital imagery and by
applying mathematical formulas derive valuable climatic data from the cloud cover
itself. Also, for those wanting specific locales the ability exists to scan
through these remotely sensed images and gather an incredible amount of data based on
the spectral signatures or other elements.

Recent  Accessions.  The  largest  collection  transferred  from  DoD  to  NARA  between  1993 
and  1996  was  of  mapping,  charting  and  geodesy  photo  images  from  the  Defense 
Intelligence  Agency.  Despite  the  value  and  resolution  of  aerial  photography,  remotely 
sensed  digital  imagery  has  largely  displaced  photo  images.  Digital  images  form  a 
valuable  database  for  Geographic  Information  Systems  (GIS).  The  GIS  needs  include 
being  able  to  strip  out  specific  data  from  the  original  images.  As  an  example,  using 
infrared  imaging  scientists  can  select  specific  spectral  signatures  showing  the  reflectance 
of  rice  plants,  and  by  applying  mathematical  models  detect  the  overall  health  of  the  plants 
and  predict  crop  returns.  Mapping  soil  types,  measuring  snow  cover  and  working  with 
climatic  models  they  forecast  stream  runoff,  moisture,  groundwater,  and  subsequently 
predict  terrain  passage  capabilities  for  an  armed  force.  Researchers  and  scientists  value 
the  old  data  for  test  and  comparison. 

Medical  records  have  undergone  significant  change  toward  digitization.  Service  member 
medical  records  contain  many  more  digital  images.  Physicians  find  MRIs  and  CAT  scans 
invaluable  in  treating  patients.  Storing  them,  though,  is  a  difficult  mix  of  varied  methods, 
rapid  change,  and  lack  of  consistency  in  standards  between  different  users.  Physicians 
using  the  best  available  tools  are  often  tied  to  a  proprietary  specification  and  machine, 
designed  for  immediate  use,  not  ease  of  archiving. 

Photographic  images  have  changed.  Some  photos  from  the  Somalia  intervention  were 
digital,  and  at  the  time  viewers  in  the  Pentagon  could  view  a  picture  taken  in  Somalia, 
transmitted  via  satellite,  downloaded,  and  printed  in  less  than  a  half  hour.  While  such 
demonstrations  are  impressive,  the  use  of  the  digital  photos  remains  limited.  The 
standards used for much of the DoD photo imagery are JPEG based, and while JPEG is
widely accepted, its compression algorithm is lossy. That is to say, it loses information
permanently when compressing, making it less desirable for archiving. Newer methods
being  tested  by  Eastman  Kodak  and  others  offer  more  potential  for  the  near  future. 
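
The difference between lossless and lossy compression can be illustrated with a short sketch. This is not JPEG itself: the lossy step below is a simple quantization of 8-bit samples, used here only as a stand-in for the quantization that makes JPEG lossy, and the function names and the step size of 16 are assumptions chosen for illustration. The lossless path uses the standard zlib codec.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress and decompress with zlib; the output matches the input exactly."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_round_trip(pixels: bytes, step: int = 16) -> bytes:
    """Quantize each 8-bit sample down to a multiple of `step`.

    The discarded low-order detail cannot be recovered afterwards,
    which is the defining property of a lossy method.
    """
    return bytes((p // step) * step for p in pixels)


# A stand-in "image": one sample of every possible 8-bit value.
pixels = bytes(range(256))

restored = lossless_round_trip(pixels)
degraded = lossy_round_trip(pixels)
changed = sum(1 for a, b in zip(pixels, degraded) if a != b)

print("lossless copy identical to original:", restored == pixels)
print("samples altered by the lossy step:", changed, "of", len(pixels))
```

Applying the lossy step a second time changes nothing further, but the original values are already gone. That permanence is the archival concern: each lossy save fixes whatever loss has occurred into the only surviving copy.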

Operational Requirements. In the operational world, vendors and the military will continue
to  develop  software  specifically  designed  for  individual  weapons  systems.  The 
immediate  demands  and  the  lack  of  commercial  application  to  many  of  these  systems 
make  the  use  of  commercial-off-the-shelf  products  impractical.  Often  of  unique  design, 
they  are  more  difficult  to  store.  Overall  this  is  not  a  major  problem  in  many  areas,  for 
while  a  specific  system  may  be  quite  important  to  a  warfighter,  the  end  product  or  result 
may  not  be  among  that  three  percent  of  items  selected  for  preservation  at  the  National 
Archives.  Logistics  databases  are  different,  since  they  often  relate  to  expenditures,  an 
item  of  interest  to  many.  The  storage  of  these  non-image  related  databases  has  already 
begun  within  current  NARA  guidelines. 




How  Non-NARA  Archival  Agencies  Use  Standards.  A  survey  of  agencies  revealed 
many  standards  questions  and  few  answers  for  archiving.  For  many,  the  real  matter  is 
providing  access  for  the  useful  life  of  the  data,  while  recognizing  the  difficulty  of  long 
term  preservation.  For  example,  the  Library  of  Congress  is  using  JPEG  for  images,  but 
recognizes  that  this  is  not  the  answer  for  long-term  preservation  because  of  the  lossy 
nature  of  the  compression  algorithm.'  The  Australian  Archives  pursues  an  active 
program to come to agreement on archiving standards, and to date has found the
challenges impressive, the solutions elusive. The Canadian Archives attempted to find a
standard for records management, and they deny recent tentative announcements of such
a standard. In the United States, a team chaired by Don Sawyer at the National
Aeronautics and Space Administration (NASA) is developing an archiving standard, but
they aim more at archiving scientific databases, and less at images. This work stems from
the publication Preserving Scientific Data on Our Physical Universe, by the National
Research Council.

Within  the  Department  of  Defense  the  Center  for  Army  Lessons  Learned  (CALL)  at  Fort 
Leavenworth,  Kansas  functions  as  a  test  bed  for  preserving  materials  for  the  Department 
of Defense. Using Excalibur, a commercial format, they are currently accessioning,
cataloging, and preserving material, making information accessible to Army and DoD
sources.  An  ambitious  project,  this  could  be  a  good  building  block  for  an  intermediate 
effort  by  the  Department  of  Defense. 

Digital  Libraries.  One  intermediate  solution  proposed  is  the  development  of  a  digital 
library.  While  new  archival  standards  and  techniques  develop,  this  would  allow  access  to 
materials.  Just  as  the  Federal  Records  Centers  provide  an  intermediate  step  for  paper 
archives  going  to  the  Archives,  the  digital  library  could  perform  such  a  function  prior  to 
turning electronic data over to NARA. At the workshop, Howard Besser described
the uses of a digital library within organizations.

Mr. William Crocca of the Xerox Corporation also discussed digital libraries, noting that a
digital  library  is  much  more  than  a  collection  of  electronic  documents.  To  make  the 
library  useful,  it  must  be  managed  to  achieve  a  balance  of  access  and  preservation.  Four 
key  areas  to  consider  are  collection  management,  stewardship,  navigation  and  access,  and 
distribution. 


•  Collection  management  involves  document  acquisition  (copyright 
permissions),  collection  sharing  and  pruning,  and  very  long  term  preservation  and  access. 

•  Stewardship  means  running  the  library  as  a  business,  with  an  eye  on  cost 
reduction,  revenue  streams  (contract  search  and  delivery),  and  value  leveraging. 

•  Navigation  and  access  refers  to  such  things  as  card  catalog  and  full  text  search, 
custom  anthologies,  and  interlibrary  loans. 

•  Distribution  means  having  a  fair  use  policy,  tracking  royalty  payments, 
granting  rights  and  control,  and  printing  and  distributing  documents. 




6.0  SPECIFIC  STANDARDS  ISSUES  DISCUSSED  AT  WORKSHOPS 


To support the contractor team in gathering the information for this analysis, the
Department of Defense sponsored two groupware workshops. These were facilitated by
the Operations Process Improvement Office (OPIO) of the Defense Information Systems
Agency (DISA) on October 30-31, 1996 and March 4-6, 1997. Several objectives
focused the sessions for a group of participants representing all the branches of the
Service, several agencies, and the archival, standards, and records management
communities. (For Workshop Products and Information, see Appendix B.) These
objectives were:

1. Familiarize participants with present and future DoD imaging requirements at NARA.

2. Review the state-of-the-art electronic archiving capability.

3. Identify current standards and proposed standards for archiving electronic images.

4. Discuss the costs associated with storage and retrieval of imagery information and the
relative costs associated with the imagery standards proposed.

5. Make recommendations regarding a limited set of imaging standards that will best
satisfy both DoD’s and NARA’s major imaging needs at a manageable cost.

Achieving  the  Objectives.  The  group  achieved  the 
first three objectives, all descriptive in nature, with
ease,  but  the  latter  two,  involving  decisions  and  analyzing  data,  were  more  difficult.  In 
identifying  current  and  proposed  standards  for  archiving  records  the  group  set  certain 
criteria  any  format  must  meet.  Unfortunately,  few  existing  formats  could  meet  the 
desired  criteria,  and  in  follow-up  work  the  contractor  team  could  only  identify  one 
standard  that  would  clearly  meet  the  requirements  of  the  group,  a  prohibitively  expensive 
standard  to  implement  throughout  the  department.  Costing  proved  even  more  elusive  to 
the group because of the difficulty of associating an expense with a particular standard. Certain
costs  were  discussed  and  presented  at  the  meeting  and  a  more  comprehensive  collection 
gathered  later,  but  they  are  only  a  generalized  view,  and  do  not  equate  with  a 
sophisticated  collection  technique  such  as  a  functional  economic  analysis. 

External  Participants.  Though  Appendix  B  contains  the  list  of  participants,  several 
representatives  from  outside  of  the  Department  of  Defense  who  gave  presentations 
deserve special recognition. In alphabetical order: Dr. Bruce Ambacher, National
Archives  and  Records  Administration,  addressed  NARA's  needs.  Dr.  Howard  Besser  of 
the  University  of  California,  Berkeley  School  of  Information  Management  and  Systems 
addressed  the  issues  of  electronic  imagery  preservation  and  the  use  of  digital  libraries. 
Mr.  George  Carson  of  GSC  Associates  shared  his  background  in  developing  standards 
and  provided  insight  into  how  the  standards  developing  process  functions.  Mr.  William 
T.  Crocca  of  Xerox  Corporation  presented  information  on  how  the  private  sector  plans  to 
archive  digital  imagery  in  the  future.  The  participation  of  these  experts  assured  the 
workshop  of  having  added  variety  and  of  expertise  in  areas  not  readily  available  in  the 
Department. 




Workshop  Responses 


In  the  second  workshop  the  group  looked  at  the  issue  of  image  standards  and  discussed 
the  needs  of  the  Department  of  Defense  and  the  group's  perceptions  of  what  would  be 
most  beneficial  to  NARA.  These  responses  group  into  five  areas:  General  Principles; 
Standards;  Access;  Migration  Issues;  and  Cost. 

General  Principles.  Under  General  Principles  the  group  readily  agreed  that  NARA  and 
DoD  must  agree  on  the  standards  used  in  images  provided  to  NARA.  They  recognized 
that  DoD  possesses  a  large  volume  of  legacy  documents,  most  of  which  currently  reside 
in  paper,  that  they  need  to  accession  to  NARA.  These  documents  will  be  converted  into 
images by Executive Order, and DoD wants to submit these documents to NARA in a
format agreeable to both parties. Across the image spectrum there is no “ideal” format,
rather  different  formats  for  different  types  of  images.  Images  do,  though,  come  in  at  least 
four  basic  categories  of  formats  that  are  important  for  archiving.  These  include  simple 
raster  formats  (such  as  TIFF);  Portable  Composed  Document  Formats  (such  as  PDF); 
Styled  SGML  (such  as  HTML),  and  SGML. 

Standards. For archiving, the group preferred non-proprietary, widely implemented
international standards supported by the marketplace and by conformant products.
Implementation of international standards should require conformance testing to ensure
interoperability. International standards also may require profiling, and it may be
necessary for NARA and DoD to agree on one interpretation. Also, since formats change
over time, and the life cycle of such changes is five to ten years, the choice of formats
must remain flexible. Moreover, when information exists in several types of
formats (composed, revisable, or published), each of these formats is a candidate for
archiving.

Access. Access to image records, as with other records, is important. Images should be
identified and accessible (search, retrieve, and view). Where possible, metadata and other
descriptive information that aids access (such as documentation, finding aids, collection
guides, full text such as an ASCII representation, and indexing) will be transferred with
each accession in compliance with the DoD Data Dictionary and, to the extent possible,
will adhere to image and other information attribute conventions. NARA may need to
accept images in more than one type of format, depending on image use. Examples include
original documents accompanied by redacted images, photos accompanied by clearer
enhanced photos, or even geographical information systems maps deriving material in
overlays from an original.

Migration Issues. Image formats in use today will need to migrate over time. The estimated cycle time is five to ten years, but it could accelerate. Migration methods should minimize the loss of information or functionality. Both style and content should be preserved if possible, although there may be tradeoffs between them.
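To make the five-to-ten-year cycle concrete, the following minimal sketch counts how many migrations a record might undergo. The 30-year retention period and the rule of one migration per completed cycle are assumptions made for illustration; neither comes from the workshop.

```python
def migrations_needed(retention_years: int, cycle_years: int) -> int:
    """Count format migrations for a record kept `retention_years`.

    Assumes (hypothetically) one migration at the end of every full cycle
    that falls strictly before the retention period ends.
    """
    if retention_years <= 0 or cycle_years <= 0:
        raise ValueError("years must be positive")
    return (retention_years - 1) // cycle_years


RETENTION_YEARS = 30  # hypothetical retention period, not from the report

fewest = migrations_needed(RETENTION_YEARS, 10)  # ten-year cycle
most = migrations_needed(RETENTION_YEARS, 5)     # five-year cycle
print(f"{fewest} to {most} migrations over {RETENTION_YEARS} years")
```

Under these assumptions, each migration is a point at which information or functionality could be lost, which is why the shorter cycle carries more risk as well as more cost.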




Costs of Using a Particular Standard. Cost is a vital issue. DoD must justify implementation costs for each standard. Present acquisition procedures fail to account for life cycle costs such as conversion for accession, media refreshment, migration, and access to information. Unless these procedures change, cost estimates will continue to be inaccurate.

The group considered the costs of using the different standards quite important, but the data is difficult to assemble. An attempt by the Yale University Library in its Project Open Book demonstrated the difficulty. Variables change with great speed, and there is little agreement among proponents of varying views on the importance of any one item. For example, the costs and timing of CD-ROMs, tapes, software, migration systems, and installation of new standards are subject to extreme variability. Good statistics to build on are important, and the futility to date of estimating anything even ten years into the future brings many current attempts into question.

Even basic costs, such as those for publishing a page in SGML, vary. The FY 1997 contract rate at the Defense Automated Printing Service is $4.75 - $7.25 per page for SGML, versus $2.75 - $4.00 per page for PDF (see Figure 7). The current cost figures leave unanswered how useful these formats will be tomorrow, how the costs will change, and how accessing systems will vary. Cost remains a critical but difficult area to assess, even with a formal data call.

Figure 7: Comparison of Imaging Standards Cost
(Bar chart. Vertical axis ticks: $5.00 to $8.00. Categories: Raster, TIFF, PDF(HC), HTML, SGML.)

Source: Defense Automated Printing Service
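The per-page rates quoted above translate into project costs by simple multiplication. The sketch below is illustrative only: the rates are the FY 1997 figures from the text, while the 200,000-page batch size and the function name are assumptions chosen for the example.

```python
from decimal import Decimal

# FY 1997 DAPS contract rates per page, as quoted in the text above.
RATES_PER_PAGE = {
    "SGML": (Decimal("4.75"), Decimal("7.25")),
    "PDF": (Decimal("2.75"), Decimal("4.00")),
}


def conversion_cost_range(fmt: str, pages: int) -> tuple[Decimal, Decimal]:
    """Return the (low, high) conversion cost in dollars for `pages` pages."""
    low, high = RATES_PER_PAGE[fmt]
    return low * pages, high * pages


PAGES = 200_000  # illustrative batch size chosen for this sketch

for fmt in RATES_PER_PAGE:
    low, high = conversion_cost_range(fmt, PAGES)
    print(f"{fmt}: ${low:,.2f} to ${high:,.2f}")
```

For 200,000 pages this gives $950,000 to $1,450,000 for SGML and $550,000 to $800,000 for PDF. These are conversion costs only; they do not include the life cycle costs (media refreshment, migration, access) that the group identified as missing from present procedures.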




The Department of Defense produces, stores, and seeks access to large quantities of information. A Defense Automated Printing Service project converting DoD Specifications and Standards involved 600,000 pages converted to SGML and 700,000 pages converted to indexed raster files. An Air Force conversion of Technical Orders to SGML covered 200,000 pages. The Navy converted 12,600,000 pages of Technical Manuals to indexed raster. For its Technical Manuals, the Army opted for PDF for 1,000,000 pages. The engineering drawings converted to raster format involved 4,100,000 pages from the Defense Logistics Agency and 13,855,000 pages from the Navy. Army Archives sent 200,000 pages for indexed raster conversion. These are only a small sampling of the items being stored in electronic format within the Department of Defense.
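The sampling above can be totaled by target format. The sketch below uses only the page counts stated in the paragraph; the short project labels and the grouping of "raster" separately from "indexed raster" are choices made for this illustration.

```python
from collections import defaultdict

# (project, target format, pages) as listed in the paragraph above.
CONVERSIONS = [
    ("DAPS Specifications and Standards", "SGML", 600_000),
    ("DAPS Specifications and Standards", "indexed raster", 700_000),
    ("Air Force Technical Orders", "SGML", 200_000),
    ("Navy Technical Manuals", "indexed raster", 12_600_000),
    ("Army Technical Manuals", "PDF", 1_000_000),
    ("DLA engineering drawings", "raster", 4_100_000),
    ("Navy engineering drawings", "raster", 13_855_000),
    ("Army Archives", "indexed raster", 200_000),
]


def pages_by_format(conversions):
    """Sum the page counts for each target format."""
    totals = defaultdict(int)
    for _project, fmt, pages in conversions:
        totals[fmt] += pages
    return dict(totals)


totals = pages_by_format(CONVERSIONS)
grand_total = sum(totals.values())
for fmt, pages in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{fmt:15s} {pages:>12,d}")
print(f"{'total':15s} {grand_total:>12,d}")
```

In this sample, raster and indexed raster conversions together account for far more pages than SGML and PDF combined.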

Consider the Defense Finance and Accounting Service projections. It expects 1,500,000 contracts a year, with up to 700 pages per contract. It also needs to retain, for a scheduled period of time, an annual production of 300,000 personal property government bills of lading (GBL), 1,600,000 freight GBLs, and 40,000,000 vouchers. As these records shift to electronic systems, the methods and cost of preserving the contractual information become more difficult to determine.
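As a rough upper bound on the scale these projections imply, the figures can be multiplied out. The sketch assumes every contract reaches the 700-page maximum, which overstates the real volume; the other record counts are items, not pages.

```python
CONTRACTS_PER_YEAR = 1_500_000
MAX_PAGES_PER_CONTRACT = 700

# Other annual record counts quoted above (items, not pages).
OTHER_RECORDS_PER_YEAR = {
    "personal property GBLs": 300_000,
    "freight GBLs": 1_600_000,
    "vouchers": 40_000_000,
}

# Upper bound: assumes every contract is at the 700-page maximum.
max_contract_pages = CONTRACTS_PER_YEAR * MAX_PAGES_PER_CONTRACT
other_records = sum(OTHER_RECORDS_PER_YEAR.values())

print(f"Contract pages per year (upper bound): {max_contract_pages:,d}")
print(f"Other records per year: {other_records:,d}")
```

The upper bound is 1,050,000,000 contract pages a year, alongside 41,900,000 other records.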

Imagery  Standards 

These  considerations  came  from  an  extensive  discussion  of  potential  standards  used  in  the 
Department  of  Defense  and  the  possibility  of  using  them  for  archiving  image  records. 
One  of  the  discussion  items  was  the  type  of  standards  used  and  how  they  meet  the 
requirements  sought.  A  number  of  standards  were  presented  and  later  cut  to  a  short  list  of 
those  that  met  criteria  established  by  the  workshop  group. 

While a true document management system can handle any file in its native format, there are situations where one file format is best for archiving. Considerations for selecting file formats include file size, image quality, and whether the files need to be black and white or color, 3-D or 2-D, and editable or read-only. There are many formats to choose from when archiving. The raster and vector formats that follow illustrate some of the choices available. They are ordered according to DoD policy: JTA-mandated standards first, TAFIM-adopted standards next, and other standards third if no mandated or adopted standards are available.

Raster 

JPEG.  Joint  Photographic  Experts  Group  (JPEG)  is  a  standard  color  compression  file 
format  mandated  by  the  JTA  and  TAFIM.  It  is  the  most  used  color  compression  format, 
supported  by  most  Web  browsers  and  used  on  many  networks.  Users  of  JPEG  must 
decide  whether  to  use  the  progressive,  sequential,  baseline,  or  lossless  format.  In 
addition,  compression  to  JPEG  loses  some  color  information,  though  a  color  compression 
toolkit  can  recalibrate  the  image  and  correct  the  exposure.  Because  of  the  popularity  of 
JPEG,  it  is  a  fairly  safe  choice  for  long-term  storage. 




CALS. Continuous Acquisition and Lifecycle Support (CALS) compression is a standardized, black and white, raster format. It is adopted by the TAFIM, but not mandated by the JTA. Future versions of the JTA are expected to address engineering and technical data standards such as CALS. Owned by the CALS Management Support Office of the U.S. Department of Defense and used by the military, this file type will probably have long-term viability. CALS uses CCITT Group III and IV compression.

GIF.  The  Graphics  Interchange  Format  (GIF)  is  an  extremely  stable,  lossless  color 
format  that  is  fully  backwards  compatible.  (Lossless  compressed  images  retain  their 
integrity  down  to  the  pixel.  Lossy  compressed  images  use  a  difference  algorithm  that 
drops  some  image  integrity.)  Designed  for  color,  GIF  is  an  8-bit  format  where  each  pixel 
represents  one  of  256  colors.  Because  of  this,  converting  files  to  grayscale  does  not 
lower  the  file  size.  GIF  is  best  for  lower  resolution  files  where  there  are  large  areas  of 
like-colored  pixels.  Although  widely  used,  it  is  not  mandated  by  the  JTA  or  TAFIM 
because  it  contains  a  patented  algorithm  for  which  the  patent  holder  is  charging  royalties. 
JPEG  provides  a  publicly  held  and  viable  alternative  with  strong  marketplace  support. 
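The distinction between lossless and lossy compression, and the reason large areas of like-colored pixels compress well, can be shown with a small sketch. It uses run-length encoding as a simple stand-in; this is not the patented algorithm GIF actually uses, and the pixel rows are made-up data.

```python
from itertools import groupby


def rle_encode(pixels: list[int]) -> list[tuple[int, int]]:
    """Losslessly encode a row of palette indexes as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs: list[tuple[int, int]]) -> list[int]:
    """Rebuild the exact original row from its runs."""
    return [value for value, length in runs for _ in range(length)]


def quantize(pixels: list[int], step: int) -> list[int]:
    """Lossy step: round each value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


flat_row = [7] * 90 + [200] * 10                 # large areas of like-colored pixels
busy_row = [(i * 37) % 256 for i in range(100)]  # every neighbor differs

flat_runs = rle_encode(flat_row)
busy_runs = rle_encode(busy_row)

print(len(flat_runs), "runs for the flat row")
print(len(busy_runs), "runs for the busy row")
print("lossless round trip:", rle_decode(busy_runs) == busy_row)
print("lossy step changes data:", quantize(busy_row, 16) != busy_row)
```

The lossless round trip returns every pixel unchanged, while the quantizing step discards detail that cannot be recovered. The flat row collapses to a couple of runs, whereas the busy row gains nothing from this kind of encoding.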

TIFF. Originally created in 1986, the flexible Tag Image File Format (TIFF) has become a widely used de facto specification. Unfortunately, TIFF is so flexible in allowing designers to create their own tags that not all applications can read all TIFF files. In addition, TIFF files may be compressed using one of several different compression formats (including JPEG compression and CCITT Group III and IV), or they may remain uncompressed. Not approved by a recognized standards body, TIFF is not included in the JTA or TAFIM. It is, however, widely used in DoD component publications, and is planned as the basis for the scanning and redaction of one billion pages of classified documents. NARA also uses it for access to documents on the Internet, but not for preservation. Even so, NARA uses TIFF version 4 and DoD uses version 6 for the Internet, and the two are not interchangeable.
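The tag flexibility described above can be seen in the file layout itself. The following minimal sketch reads the tag numbers from the first image file directory of a TIFF byte stream and flags those in the private range (32768 and above in TIFF 6.0). The sample bytes are a hand-built header with two entries, not a complete image, and the private tag number 65000 is hypothetical.

```python
import struct

PRIVATE_TAG_START = 32768  # TIFF 6.0 leaves tags at or above this to private use


def read_ifd_tags(data: bytes) -> list[int]:
    """Return the tag numbers in the first image file directory (IFD)."""
    byte_order = data[:2]
    if byte_order == b"II":      # little-endian ("Intel") byte order
        endian = "<"
    elif byte_order == b"MM":    # big-endian ("Motorola") byte order
        endian = ">"
    else:
        raise ValueError("missing TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = []
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, field type, count, value or offset.
        entry_offset = ifd_offset + 2 + 12 * i
        tag, _type, _count, _value = struct.unpack_from(
            endian + "HHII", data, entry_offset
        )
        tags.append(tag)
    return tags


# Hand-built sample: header, then an IFD with ImageWidth (256) and one
# hypothetical private tag (65000), then a zero "next IFD" offset.
sample = (
    struct.pack("<2sHI", b"II", 42, 8)
    + struct.pack("<H", 2)
    + struct.pack("<HHII", 256, 3, 1, 100)
    + struct.pack("<HHII", 65000, 4, 1, 7)
    + struct.pack("<I", 0)
)

tags = read_ifd_tags(sample)
private_tags = [t for t in tags if t >= PRIVATE_TAG_START]
print("tags:", tags, "private:", private_tags)
```

A reader that does not recognize a private tag can skip it, but if the tag carries information needed to interpret the image, the file may not display as intended. An archive could use a check of this kind to flag files that depend on private tags before accession.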

Vector 

SGML.  Standard  Generalized  Markup  Language  is  a  formal  standard  defined  by 
ISO/IEC  8879,  FIPS  152,  and  MIL-M-28001B.  SGML  is  a  meta-language  that  allows 
users  to  define,  in  machine-readable  form,  the  structure  and  content  of  any  class  of 
documents.  The  standard  specifies  a  method  for  creating  document  hierarchy  models  in 
which  every  element  in  a  document  fits  into  a  logical,  predictable  structure. 

SGML is able to separate the logical and physical structure of text. In this way, the standard is able to distinguish between the role of a piece of text (e.g., caption, title, chapter, index) and its appearance (e.g., type face, font, size, margin). This permits text to be tagged with descriptive markup, enhancing its functionality. By providing the ability to associate processing instructions with document markup, SGML includes a mechanism for referencing non-text forms within a document. By providing tags that enable query and hypertext capabilities, SGML allows the production of intelligent documents for distribution and use on CD-ROM and other random access media. The SGML standard is useful to organizations that exchange information between systems, applications, departments, and users.
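The separation of logical role from appearance can be sketched in a few lines. The example below uses XML syntax (a later, simplified profile of SGML) because a parser for it ships with Python; the element names, sample text, and style tables are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: tags name the role of each piece of text, not its look.
DOCUMENT = """
<report>
  <title>Electronic Imaging Standards</title>
  <chapter>
    <title>Recommendations</title>
    <para>Adopt a structured format.</para>
    <caption>Figure 1: Sample</caption>
  </chapter>
</report>
"""

# Appearance lives in separate, replaceable tables (hypothetical styles).
PRINT_STYLE = {"title": "bold 14pt", "para": "roman 10pt", "caption": "italic 9pt"}
SCREEN_STYLE = {"title": "bold 18px", "para": "roman 12px", "caption": "italic 11px"}


def render(markup: str, style: dict[str, str]) -> list[str]:
    """Pair each styled element's text with the appearance chosen for its role."""
    root = ET.fromstring(markup)
    return [
        f"[{style[elem.tag]}] {elem.text.strip()}"
        for elem in root.iter()
        if elem.tag in style
    ]


def texts_by_role(markup: str, role: str) -> list[str]:
    """Query by logical role, independent of any style table."""
    return [elem.text.strip() for elem in ET.fromstring(markup).iter(role)]


for line in render(DOCUMENT, PRINT_STYLE):
    print(line)
print(texts_by_role(DOCUMENT, "title"))
```

The same marked-up text renders under either style table without being edited, and the query by role works without reference to appearance. That independence is what makes structured markup attractive for migration.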

CGM. Computer Graphics Metafile (CGM) is a standard color file format, primarily oriented towards stroke-drawn graphics, such as polylines and filled polygons, though it also supports raster bitmap encodings. CGM may or may not be compressed. It became an international standard in 1987 and is the national standard for many countries, including the U.S. and U.K. The U.S. Department of Defense adopted CGM for its major CALS open documentation initiative. The JTA and TAFIM mandate use of CGM. Since the CGM format is very complex, not all applications support all entities.

IGES. The Initial Graphics Exchange Specification (IGES) format was designed as an exchange format. Used primarily for 3-D images, IGES files tend to occupy a great deal of storage space. IGES is often selected because it is independent of any application and is supported by many high-end CAD and modeling programs. Like the DXF format, IGES bridges the gap across applications, allowing users to save a CAD file as IGES, then load it directly into a graphics or 3-D modeling application. CALS also supports IGES.

Products can convert IGES files to Computer Graphics Metafiles (CGM). IGES is used primarily in the mechanical CAD sector, whereas CGM has spread across diverse sectors including mechanical and electrical CAD, geophysical exploration, GIS, desktop publishing, and presentation graphics. This means that IGES files can be converted to CGM for import into any publishing system or for production as hard copy outside the CAD system.

In  1996,  IGES  was  revised  and  redesignated  as  Digital  Representation  for 
Communication  of  Product  Definition  Data  (ANS/USPRO/IPO  100-1996).  Adopted  by 
the  TAFIM,  it  is  not  included  in  the  JTA. 

STEP.  Standard  for  the  Exchange  of  Product  Model  Data  (ISO  10303)  provides  a 
representation  of  product  information  along  with  the  necessary  mechanisms  and 
definitions  to  enable  product  data  to  be  exchanged.  The  exchange  is  among  different 
computer  systems  and  environments  associated  with  the  complete  product  lifecycle 
including  design,  manufacture,  utilization,  maintenance,  and  disposal. 

The  overall  objective  of  STEP  is  to  provide  a  mechanism  that  is  capable  of  describing 
product  data  throughout  the  life  cycle  of  a  product,  independent  from  any  particular 
system.  The  nature  of  this  description  makes  it  suitable  not  only  for  neutral  file 
exchange,  but  also  as  a  basis  for  implementing  and  sharing  product  data  bases  and 
archiving.  In  the  long  term,  STEP  will  probably  replace  IGES  because  it  covers  the 
entire  product  lifecycle.  A  strategy  to  migrate  from  IGES  to  STEP  is  being  developed  by 
the  IGES/PDES  (Product  Data  Exchange  Specification)  Organization.  Adopted  by  the 
TAFIM,  STEP  is  not  included  in  the  JTA. 




PDF. A direct subset of PostScript, the Portable Document Format (PDF) is more compact and has a faster processing time. It also has some capability for hyperlinking and is supported by more applications than PostScript. Although generally used for textual information, PDF files can also support graphics. PDF uses a page definition language, as opposed to image definition, making the files bulky. As with PostScript files, PDF files are not editable. The JTA mandates that all organizations be capable of reading and printing documents in PDF format.

HTML. HyperText Markup Language (HTML) is an informal Internet standard defined by RFC 1866. HTML consists of a set of tags that conform to SGML rules and conventions. The HTML tag set can be used as the basis to define a DTD (Document Type Definition) that is consistent with SGML syntax. By defining HTML in an SGML DTD, HTML becomes an SGML application.

The HTML document type contains relatively general semantics for representing information and for linking data and documents, with a limited SGML tag set and limited formatting capability. Moreover, simplicity was the guide in development so that multiple browsers and editors could be used on multiple platforms. Specific uses include hypertext news, mail, on-line documentation, menus of options, database query results, and simply structured documents with in-line graphics. HTML allows networked hypertext to use text, sound, movies, and images in a variety of formats.
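Because HTML documents point to other documents and to in-line graphics, an archive would need to know which external files a page depends on. The following sketch lists them using Python's standard `html.parser` module; the sample page and its file names are placeholders.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hyperlink targets and in-line image sources from HTML text."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []
        self.images: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


# Hypothetical page; the file names are placeholders.
PAGE = """
<html><body>
<h1>Technical Manuals</h1>
<p>See the <a href="manual1.html">first manual</a> and
the <a href="manual2.html">second manual</a>.</p>
<img src="diagram.gif" alt="Wiring diagram">
</body></html>
"""

collector = LinkCollector()
collector.feed(PAGE)
print("links:", collector.links)
print("images:", collector.images)
```

A page transferred without the files it references would lose both its links and its graphics, so a listing of this kind is one way to check that an accession is complete.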

DXF. Drawing Exchange Format (DXF) is a color, proprietary format of Autodesk. It is supported by most CAD-based and 3-D graphics programs and has become a standard, but after years of development, older versions of the format may or may not be supported. One of its current uses is to bridge the gap from one application to another. For example, a file can be saved as a DXF in AutoCAD, then loaded in 3-D Studio or another graphics program. As an ASCII-based format, DXF files tend to be bulky.

HPGL. Developed for pen vector plotters, Hewlett Packard Graphics Language (HPGL) files are a vector-based system of lines. They save time because many printers and plotters can read them directly, so they can be copied to the printer or plotter without first being loaded into an application. HPGL/2 files are a binary representation of HPGL with new commands, making the format faster and more compact than HPGL. HPGL/2 files can also have TIFF files embedded.
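To illustrate what "a vector-based system of lines" means in practice, the sketch below builds a short plain-text HP-GL command string for a square. It uses only the basic pen commands (IN, SP, PU, PD); the helper name and coordinates are chosen for illustration, and real plotter files typically carry additional setup commands.

```python
def hpgl_polyline(points: list[tuple[int, int]], pen: int = 1) -> str:
    """Build a plain-text HP-GL command string that draws one polyline.

    IN initializes the plotter, SP selects a pen, PU moves with the pen up,
    and PD draws with the pen down through the remaining points.
    """
    if len(points) < 2:
        raise ValueError("a polyline needs at least two points")
    (start_x, start_y), rest = points[0], points[1:]
    drawn = ",".join(f"{x},{y}" for x, y in rest)
    return f"IN;SP{pen};PU{start_x},{start_y};PD{drawn};PU;SP0;"


# A square in plotter units (coordinates chosen for illustration).
square = [(0, 0), (1000, 0), (1000, 1000), (0, 1000), (0, 0)]
commands = hpgl_polyline(square)
print(commands)
```

The file stores drawing instructions rather than pixels, which is why a plotter can execute it directly and why its size depends on the number of strokes rather than on the size of the sheet.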

PostScript. PostScript files are bulky, but they are an industry standard. These uneditable files present a time-saving alternative to scanning documents and, like HPGL files, can be sent directly to any PostScript printer. Unfortunately, not many applications can directly import PostScript files.

Neither the JTA nor the TAFIM includes the DXF, HPGL, or PostScript formats.

Other 




There are other vector formats, such as DWG, ME10, and CADKey, but these are proprietary and may be more likely to change or be discontinued. If choosing a proprietary file format, an agency may want to archive the native application as well, to assure the ability to view and edit the files even if the application becomes obsolete or is discontinued. Keeping files in their native format also ensures that there is no conversion loss from translation.

Another  alternative  is  to  use  a  viewing  application.  Viewing  applications  load  many 
different  file  formats  including  older  formats,  and  often  include  other  features  such  as 
redlining  capability  and  ISO  9000  compliance  banners.  Many  of  these  applications  can 
integrate  with  a  database  or  document  management  software  for  fast  file  access. 


Table 2. Characteristics of Selected Image Formats

FILE         MARKS UNDER RASTER, VECTOR, 3-D,     COMPRESSED/     PROPRIETARY
FORMAT       COLOR, EDITABLE* (left to right)     UNCOMPRESSED

CADKey       X  X  X  X                           Uncompressed    X
CALS         X  X                                 Compressed
CGM          Rarely  X  X  X                      Either
DWG          X  X  X  X                           Uncompressed    X
DXF          X  X  X  X                           Uncompressed    X
GIF          X  X  X                              Compressed
HPGL         X  X                                 Uncompressed
IGES         X  X  X  X                           Uncompressed
JPEG         X  X  X                              Compressed
ME10         X  X  X  X                           Either          X
PDF          X  X                                 Uncompressed
PostScript   X  X                                 Uncompressed
TIFF         X  X  X                              Either

Compression  is  also  a  key  factor  in  archiving.  Users  may  want  to  zip  files  to  attain 
maximum  storage  capability  in  their  archiving  system.  Though  there  are  several 
compression  packages  available,  PKZIP,  the  standard  for  nearly  a  decade,  is  consistently 
backwards  compatible.  Some  formats,  such  as  GIF  and  JPEG,  are  already  compressed, 
but  most  vector  formats  are  not.  To  archive  thousands  of  files,  however,  compressing 
each  one  can  be  a  time-consuming  and  costly  process. 
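The point that already compressed formats gain little from zipping, while text-based vector formats gain a great deal, can be checked with a short sketch. It uses Python's `zlib` module, which implements the DEFLATE method commonly used inside ZIP archives. The vector-style text is made up, and seeded pseudo-random bytes stand in for an already compressed file.

```python
import random
import zlib

# ASCII vector-style data: many similar text records (made-up content).
vector_text = "".join(
    f"LINE {i % 50},{i % 30} TO {i % 50 + 10},{i % 30 + 10}\n" for i in range(2000)
).encode("ascii")

# Stand-in for an already compressed file: bytes with no redundancy left.
rng = random.Random(1997)
already_compressed = bytes(rng.getrandbits(8) for _ in range(len(vector_text)))


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


text_ratio = compression_ratio(vector_text)
noise_ratio = compression_ratio(already_compressed)
print(f"ASCII vector text: {text_ratio:.1%} of original size")
print(f"already compressed stand-in: {noise_ratio:.1%} of original size")
```

The repetitive text shrinks to a small fraction of its size, while the stand-in for compressed data does not shrink at all. When archiving thousands of files, a check of this kind could help decide which formats are worth the time and cost of an additional compression pass.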

What  are  Some  Future  Standards? 

SPIFF. Still Picture Interchange File Format (SPIFF) is the “official” JPEG file format. Part 3 of the JPEG standard (ISO 10918) now includes a fully defined file format for storing JPEG data. When JPEG was standardized, disagreements among ISO committees prevented creation of a standard JPEG file format. The de facto format that appeared was JFIF from C-Cube Microsystems. The JFIF format, although now quite widespread, is very limited in capability as file formats go. JFIF is currently mandated by the JTA.

SPIFF is intended to replace the JFIF file format, adding features (more colorspaces, a recognized way of including text blocks, and so forth) and providing backwards compatibility that allows SPIFF files to be read by most JPEG/JFIF decoders. JFIF, however, has a five-year head start on SPIFF, so rapid replacement is unlikely.

FlashPix.  Designed  by  a  consortium  of  Kodak,  Hewlett  Packard,  Live  Picture,  and 
Adobe,  FlashPix  is  planned  to  replace  Photoshop  5.0.  It  contains  a  robust  set  of  metadata 
tags  and  may  become  a  new  de  facto  specification  because  of  the  influence  of  the 
developers. 

Xerox  Information.  Xerox  Corporation  has  studied  some  of  the  problems  of  document 
storage  and  Mr.  William  Crocca  presented  some  information  on  Digital  Libraries  and 
possible  future  directions.  Possible  future  directions  as  seen  by  Xerox  involve  a  mix  of 
developments  in  hardware  and  software  technology.  Some  key  points  made  by  Mr. 
Crocca  were: 

•  Leverage on-going research in electronic archiving
•  Disciplined adoption of newer storage media
•  More integration with transaction systems
•  Richer linguistic tools that allow better navigation for patron-sought concepts
•  Knowledge management to provide richer yet more focused searching and focused data delivery
•  Incorporation of optical search/pattern recognition such as in graphics and photographs

Classes  of  Imagery  Standards 

Standards  and  specifications  can  be  identified  by  their  roles  in  the  document  life-cycle. 
One  way  of  describing  these  roles  is  to  use  the  four  main  standards  categories  of  content, 
structure,  presentation,  and  distribution.  This  provides  a  framework  for  discussing 
standards  and  products  by  specific  functional  areas. 

Contents Standards. Contents standards relate to data representation and include CGM as the vector graphic data interchange standard, IGES as the product data exchange standard, and ASCII as the character set standard.

Structure Standards. Structure standards describe how rules for document organization should be specified. SGML is the Document Type Custom Definition Standard, which specifies the way to use a Document Type Definition (DTD) to represent the dependence between document components. Standard for the Exchange of Product Model Data (STEP) is another product data exchange standard. TIFF is an informal standard that falls into the structure class.

Presentation Standards. Presentation standards ensure that similar structure contents are represented in similar ways, regardless of the implementation approach. The most commonly used presentation standard is PDF, which allows the display of documents in their original format. PDF is currently proposed as a standard for the delivery and presentation of non-revisable documents.

Distribution Standards. Distribution standards are links between applications, enabling documents to be packaged and exchanged between applications during their life cycle. Distribution standards involve media as well as format, such as the Optical Digital Technology Standard for CD-ROM.

Another  way  of  classifying  imaging  standards  is  to  put  them  on  a  sliding  scale  from  static 
to  dynamic  based  on  the  amount  of  change  the  corresponding  document  can  undergo. 
The  following  diagram,  presented  at  the  workshop  by  Steve  Carson,  shows  various 
document  types  and  related  standards. 


Table 3. Document Types and Related Standards

Document Type         Page            Raster   PCDF         Word Processor         Styled SGML   SGML
Standard or Product   ASCII, EBCDIC   TIFF     PDF, Envoy   MS Word, WordPerfect   HTML          SGML

Not Revisable  <------------------------------------------------------------------------>  Revisable
Final Form

For  archiving  purposes,  documents  transmitted  in  ASCII  or  EBCDIC  retain  the  original 
data  but  the  page  layout  format  is  lost.  TIFF  and  PDF  also  retain  the  original  data  but 
carry  the  formatting  codes  forward.  MS  Word,  WordPerfect,  HTML,  and  SGML  are 
revisable  formats  and  are  able  to  include  images  and  formatting.  The  four  key 
products/standards  for  electronic  imaging  are  TIFF,  PCDF,  Styled  SGML  and  SGML. 

Considerations for the Future. Looking to the future, the group session revealed a number of concerns. Most fall into one of three categories: determine what can be or has been done; determine the format; and recommend policy changes.

Estimate Volume and Format of Imagery Records. To find what can be and has been done, group members recommended follow-up research. This would include determining the costs of using each standard and surveying DoD organizations to determine the volume and formats of documents. A data call should gather specific raw data about the prioritized list of image types, such as legacy documents and engineering drawings. The call should include better figures about volume and standards or formats, including the uniformity of image applications used throughout DoD. The data call would provide a validity and reliability check on intuitive conclusions.

To determine the format, group members presented several ideas. They included applying the principles considered earlier in this chapter and using a team of subject matter experts in standards to determine the proper standards to use. Researchers would determine the extent of the universe for various types of images. They would identify requirements including new profiles, standards, conformance testing and certification efforts, test suite generation and promulgation efforts, and joint industry and government initiatives.

Determine Cost of Using Various Imagery Standards. In recommending policy changes, members suggested determining, based on DoD’s requirements, the appropriate electronic image standards to recommend for NARA and DoD’s use. When NARA and DoD reach agreement on the image standards to use, the principles suggested by the group should form the basis for determining the costs, including access and migration, as applied to NARA and DoD. They should determine and cost probable migration methods and prepare documents for inclusion in the budget to fund these future migrations. Following this, DoD should charter a trial implementation effort. Following adoption of the image standards, current directives such as the JTA, TAFIM, and DoD 5015.2 Records Management Policy will need updating or revision. As technology advances, the imagery policies and standards should be reviewed and updated on a periodic basis.




7.0  RECOMMENDATIONS 


Between them, the group workshops and the contractor team produced several recommendations, grouped in three areas. First, what standards should NARA and DoD adopt? Second, in what imagery standards bodies or industry affinity groups should DoD participate to assure input into the criteria for those standards or specifications? Third, what directions would most benefit DoD and the National Archives in achieving the goals of preserving imagery and information while providing greater access?

Image  Standards  That  NARA  and  DoD  Should  Adopt. 

Recommendation  1:  Adopt  SGML  as  an  archiving  standard  at  DoD  and  NARA, 
with  certain  caveats. 

Recommendation  2:  Have  a  team  of  subject  matter  experts  continue  evaluating 
other  imaging  standards,  including  proprietary  systems. 

The second workshop did not make a clear selection of a standard or standards at its meeting; rather, it selected criteria to meet. Early in the process the group recognized the problems of using every standard for archiving. Among the hundreds of standards, most simply do not have broad enough usage to justify the cost and effort, at this time, of preserving them. After a thorough review of the criteria an image standard must meet for selection as an archiving standard, the contractor team concluded that only a handful could make the “short list.” These included JPEG, JFIF, TIFF, PDF, HTML, and SGML. One standard, SGML, met the rigorous requirements more fully than any other, and even it did so only with caveats. Some that did not meet all requirements in the tentative listing were retained for future review because of the difficulty of measuring certain parts of the criteria, such as costing, at this time. Also, changing views in the federal government on the use of proprietary systems may alter the view on their use.

Recognizing  that  there  is  no  one  “ideal”  format  for  all  archiving  of  publications,  the 
workshop  members  proposed  certain  criteria  for  evaluating  use  of  standards  in  archiving: 

•  Standards  selected  should  be  widely  implemented  international  standards 
supported  by  the  marketplace  and  conformant  products 

•  Where  possible,  avoid  proprietary  products 

•  International  standards  may  require  profiling 

•  When  a  document  exists  in  multiple  formats,  consider  each  for  archiving 

•  Standard  allows  migration  with  minimal  loss  of  information  or  functionality 

•  Preserve  style  and  content  where  possible 

•  Justify  cost 

•  Costs  should  include  entire  life-cycle,  including  conversion  for  accession, 
media  refreshment,  migration,  and  access  to  information. 

•  Standard  must  contain  metadata  that  support  cataloging  and  accessing 




•  Desire  mature  standard,  one  around  long  enough  to  ensure  its  acceptance  and  to 
assure  that  many  will  promote  its  migration  to  new  systems 

•  Standard  should  be  stable,  with  few  major  changes  in  recent  years 
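Some of these criteria can be applied mechanically. The sketch below filters standards on the four yes/no attributes tabulated in Table 4 (proprietary, widely used, ISO specification, stable). It is an illustration only, not the workshop's decision procedure: criteria such as metadata support, migration loss, and life-cycle cost are not captured, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    name: str
    proprietary: bool
    widely_used: bool
    iso_spec: bool
    stable: bool


# Yes/no values as tabulated in Table 4 of this report.
TABLE_4 = [
    Standard("ASCII", False, True, True, True),
    Standard("TIFF", True, True, False, True),
    Standard("PDF", True, True, False, True),
    Standard("MS Word/WordPerfect", True, True, False, True),
    Standard("HTML", False, True, False, True),
    Standard("SGML", False, True, True, True),
    Standard("JPEG/JFIF", False, True, True, True),
]


def meets_tabulated_criteria(s: Standard) -> bool:
    """Non-proprietary, widely used, an ISO specification, and stable."""
    return (not s.proprietary) and s.widely_used and s.iso_spec and s.stable


candidates = [s.name for s in TABLE_4 if meets_tabulated_criteria(s)]
print(candidates)
```

On these four attributes alone, ASCII, SGML, and JPEG/JFIF pass. The remaining criteria, which need the fuller evaluation the group called for, are what distinguish among them.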


Table 4: Standards Relationships Addressed at Workshop

Standard              Standards          DoD           Cost per page   Propri-   Widely   ISO    Stable   JTA - J
                      Body               Represented   (DAPS)          etary     Used     Spec            TAFIM - T
                                                                                                          Neither - N

ASCII                 ISO                Y             N/A             N         Y        Y      Y        J
TIFF                  Microsoft          Y             N/A             Y         Y        N      Y        N
PDF                   Adobe              N/A           $2.75           Y         Y        N      Y        J
MS Word/WordPerfect   Microsoft/Novell   N/A           N/A             Y         Y        N      Y        J
HTML                  IAB                Y             N/A             N         Y        N      Y        J
SGML                  ISO                Y             $4.75-$7.25     N         Y        Y      Y        J
JPEG/JFIF             ISO                Y             N/A             N         Y        Y      Y        J
Raster                (pricing only)                   $0.16
OCR/SCR/Photo Scan    (pricing only)                   $1.80

SGML  is  the  image  standard  recommended  for  adoption  by  NARA  and  DoD. 
Since  many  DoD  agencies  already  use  SGML  for  setting  up  records,  agencies  with 
operational  documents  or  their  own  archival  records  in  SGML  should  be  allowed  to 
transfer  material  to  NARA  in  SGML  format.  The  caveat  is  that  other  organizations  that 
do  not  have  records  stored  in  SGML  should  not  have  to  store  them  in  this  way  because  of 
the  cost  of  SGML.  FY  1997  costs  for  transferring  a  document  to  SGML  format  by  the 
Defense  Automated  Printing  Service  are  $4.75-$7.25  per  page.  Other  methods,  such  as 
PDF,  at  $2.75  per  page  are  far  more  cost  effective  than  SGML,  but  they  fail  to  meet 
archiving  criteria  on  other  counts.  The  following  table  summarizes  this  recommendation 
of  SGML. 


Table 5. SGML Characteristics

Specification Title:   Standard Generalised Mark-up Language (SGML), ISO 8879:1986

Applicability:         SGML is a mark-up language for defining the logical structure of documents. SGML is capable of supporting the integration of text, graphics and scanned images. SGML permits breaking a document into parts storable in several files. A document is characterised by its content (logical structure) and its internal organisation (layout structure), derived from a formal type by means of a Document Type Definition (DTD).

Level of Consensus:    SGML is an ISO standard adopted by the ISO organisation and the CALS program.

Maturity:              The concepts for SGML are mature.

Stability:             Within the current ISO 5 year review process, minor changes to SGML are expected but they will be compatible with the existing standard.

Product Availability:  Many SGML products are available, but not all are conformant or interoperable.

Conformance Testing:   SGML products are now certified only by the US, to the ANSI X3.190-1993 standard, which is being adopted as an ISO standard. The Conformance Testing for SGML Systems standard (under development by ISO fast track from ANSI X3.190-1993) is ISO/IEC DIS 13673:1994.

The group also recommended completing the evaluation of existing standards using the
workshop criteria. Table 4 depicts the results of a preliminary study. A more accurate
study is needed, particularly in the areas of cost and of viability to the National Archives.

The  next  stage  would  be  to  assemble  a  team  of  experts  to  include  new  profiles,  standards, 
conformance  testing  and  certification  efforts,  test  suite  generation  and  promulgation 
efforts,  and  joint  industry  and  government  initiatives. 

As  technology  advances,  review  the  imagery  policies  and  standards  and  update  them  on  a 
periodic  basis.  This  would  require  a  group  to  monitor  the  program. 

Imagery  Standards  Bodies  With  Whom  DoD  and  NARA  Should  Participate 

Recommendation  3:  DoD  should  retain  representation  on  standards  bodies  and 
seek  an  advisory  role  in  proprietary  systems  that  significantly  affect  DoD. 

OMB  Circular  A-119  Federal  Participation  in  the  Development  and  Use  of 
Voluntary  Standards.  This  Circular  establishes  policy  for  executive  agencies  in 
working  with  voluntary  standards  bodies.  It  also  establishes  policy  for  executive  branch 
agencies  in  adopting  and  using  voluntary  standards.  In  short,  this  circular  recommends 
that  agencies  adopt  voluntary,  international  standards  based  on  performance  criteria 
whenever  possible.  It  also  encourages  participation  by  Federal  agency  employees  in 
voluntary standards bodies and standards-developing groups. Agency representatives
should participate actively, on a basis of equality with private sector representatives and
not seek to dominate such groups.

Recognizing  this  need,  the  contractor  team  reviewed  the  current  DoD  participation  in 
standards  efforts.  In  short,  the  team  determined  that  the  Department  should  retain  its 
affiliation  with  formal  standards  organizations.  A  suggested  change  is  that  where 
feasible,  DoD  should,  as  a  prudent  customer,  work  with  developers  of  commercial 
standards  to  develop  de  facto  specifications  that  incorporate  the  Department’s  needs. 

First, consider the principal standards organizations in the United States. The IGES/PDES
Organization (IPO) of US PRO is the representative body of individuals that develops
standards  for  product  data  exchange  (PDE)  technology.  Working  in  cooperation  with  the 
ISO  through  its  U.S.  Technical  Advisory  Group  (TAG),  the  IPO  is  also  the  U.S. 
representative  for  the  development  of  the  international  standard  STEP.  The  American 
National  Standards  Institute  (ANSI)  accredited  the  IPO  as  the  U.S.  organization  to 
develop  standards  and  specifications  for  the  sharing  and  exchange  of  product  information. 

Currently  DoD  participates  in  the  major  international  and  national  imagery  standards 
bodies.  The  following  table  summarizes  these  efforts: 


Table  6.  Standards  Bodies  and  DoD  Representatives 


Standards Body                              DoD Representative

International (ISO)
  JTC 1 SC 24 TAG (Computer Graphics        Dr. Doris Bernardini
    & Image Processing)
  JTC 1 SC 29 WG 11 (Coding Audio,          Dr. Doris Bernardini
    Picture, Multimedia & Hypermedia)

National (ANSI)
  Technical Committee X3H3 (Computer        Dr. Doris Bernardini
    Graphics & Image Processing)
  Technical Committee X3L1 (Coding          Dr. Doris Bernardini
    Audio, Picture, Multimedia &
    Hypermedia)
  Task Group X3H3.8 (Image Processing       LCdr Mike Morris
    and Interchange)

NARA and DoD records managers should meet with the DoD representatives to the
national  and  international  standards  bodies  listed  above  to  develop  requirements  and  a 
strategy  for  participation  in  these  standards  bodies.  Appendix  H  is  a  complete  list  of 
DoD  representatives  to  International,  National  and  Federal  Standards  Bodies. 




Table  7.  Standards  Bodies  and  Represented  Standards 


Standards Body - International (ISO)        Standards

JTC 1 SC 2 Coded Character Sets             ASCII
JTC 1 SC 18 Document Processing and         SGML
  Related Communication
JTC 1 SC 24 TAG (Computer Graphics          JPEG
  & Image Processing)
JTC 1 SC 29 WG 11 (Coding Audio,            MPEG
  Picture, Multimedia & Hypermedia)

The  Department  of  Defense  should  consider  changing  policy  to  allow  participation  in  the 
development  of  “de  facto  standards  or  specifications.”  The  Department  is  requesting  that 
members  seek  commercial-off-the-shelf  solutions  and  this  necessitates  working  with  the 
developers  of  proprietary  systems.  This  would  require  a  change  in  direction,  and  would 
require  DoD  to  contact  organizations  such  as  Adobe,  the  manufacturer  of  PDF,  or  the 
consortium  developing  FlashPix.  To  date,  the  commercial  sector  has  created  most  of  its 
products  without  formal  input  by  DoD. 

New  Directions  That  Benefit  DoD  and  NARA. 

The  workshop  groups  identified  several  areas  that  need  further  study  by  the  Department 
of  Defense.  These  concerned  costs,  gathering  of  information,  and  specific  changes 
needed  to  facilitate  the  preservation  of  and  access  to  digitized  records. 

Recommendation  4:  Determine  the  life  cycle  costs  of  all  potential  imaging  standards 
options  so  they  can  be  used  in  budget  planning. 

The  group  recommended  that  the  life  cycle  costs  of  each  potential  standard  be 
determined,  with  these  figures  used  in  budget  planning.  Probable  future  migration  and 
access methods should be applied to the costs to both NARA and DoD. Supporting
documents should be prepared for inclusion in future budget initiatives. Gathering and
creating accurate and effective data is difficult, since much of the future of electronic
data is uncertain. Yale University’s Project Open Book attempted to look at some costs,
and the authors acknowledged that this is a nearly impossible, but necessary, aspect of
budget planning.

Recommendation  5:  Conduct  a  data  call  to  determine  the  standards  and  quantity  of 
images  used  across  DoD. 

The  group  recommended  a  formal  data  call  to  determine  with  greater  accuracy  the  present 
and  future  volume  and  formats  of  images.  The  result  would  give  greater  precision  in 
budget planning. No such listing exists today. It would help in determining the present
use of image standards, and a properly conducted data call would also indicate
the direction of future electronic imaging.




Recommendation  6:  Evaluate  the  need  for  a  DoD  digital  library  as  an  intermediate 
step  toward  archiving  and  as  a  means  of  coordinating  ongoing  digital  library  and 
repository  programs. 


The  last  item  covered  by  the  group  was  the  possibility  of  a  digital  library  or  other 
intermediate  step  allowing  access  to  data  prior  to  archiving.  This  comes  from  the  concern 
over  how  to  provide  for  the  needs  of  the  Services  and  Agencies  while  providing  for 
NARA's  needs.  The  suggestion  goes  back  to  Figure  3  on  page  5,  where  there  is  a  clear 
sequence  of  activities  taking  non-electronic  records  from  the  Department  to  Federal 
Records  Centers  where  they  reside  for  the  duration  of  their  schedule.  They  are  then 
retired  or  moved  to  NARA.  In  the  case  of  Electronic  Records,  the  movement  is  simply 
from  DoD  to  NARA.  (See  Figure  7)  A  much  more  sensible  approach  would  be  to  move 
the  digital  records  from  DoD  into  a  digital  library.  Several  Services  and  Agencies  have 
already started doing this, and without coordination, their efforts may lead to
non-standardized, stovepipe systems. A DoD-wide oversight organization could set up a
method that would allow for standardization, and could forward records in a
NARA-acceptable format to meet preservation criteria. There would be no need to stop
any departmental effort, only to adjust it to assure standardization.


Figure  7.  Proposed  Flow  of  Information  to  NARA 




8.0  CONCLUSION 


The  purpose  of  this  report  was  to  conduct  a  requirements  analysis  for  electronic  records 
recording  formats  that  will  lead  to  the  selection  of  alternative  standards  for  the  storage 
and  retrieval  of  electronic  records  and  the  information  they  contain.  Using  two 
workshops  as  the  basis  for  the  source  material,  the  study  reflects  the  considerable 
progress  made  in  that  direction.  Criteria  were  applied,  some  solutions  found,  and 
directions  to  follow  to  resolve  the  remainder  established. 

Specifically, the report identifies the image standards bodies with which the Department
of Defense should participate. Additionally, DoD is advised to pursue relationships with
commercial producers. As more products are purchased commercial-off-the-shelf, the
Department should, as a customer, work with its providers. This means contacting
organizations such as Adobe to explain how DoD might benefit from improvements to
PDF, or working with the consortium developing FlashPix to share needs before the
specifications are complete.

The selection criteria established by the group narrowed the assemblage of standards to
consider to fewer than a dozen. None of this dozen were immediately eliminated from
consideration, because they need further research. One, SGML, should be accepted by
the National Archives and the Department. The caveat is that DoD organizations now
using SGML should be allowed to preserve and deliver their data to NARA in this
format. The high cost of converting documents into this format makes it impractical as a
means of storing all archives. For example, converting the one billion pages subject to
imaging and redaction in an ongoing declassification effort would cost
$4,750,000,000 even at the lowest FY 1997 rate of $4.75 per page, an impossible figure
to justify in the budget.
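
The arithmetic behind this estimate can be sketched as follows. This is an illustration only, not a costing method used by the study team: it assumes the FY 1997 per-page conversion rates quoted earlier in this report ($4.75 to $7.25 for SGML, $2.75 for PDF) apply uniformly to all one billion pages, and the function and variable names are hypothetical.

```python
# Per-page conversion rates quoted in this report (FY 1997, in dollars).
SGML_RATE_LOW = 4.75
SGML_RATE_HIGH = 7.25
PDF_RATE = 2.75

# Pages subject to imaging and redaction in the declassification effort.
PAGES = 1_000_000_000


def conversion_cost(pages: int, rate_per_page: float) -> float:
    """Return the total conversion cost, assuming a flat per-page rate."""
    return pages * rate_per_page


if __name__ == "__main__":
    options = [
        ("SGML (low rate)", SGML_RATE_LOW),
        ("SGML (high rate)", SGML_RATE_HIGH),
        ("PDF", PDF_RATE),
    ]
    for label, rate in options:
        total = conversion_cost(PAGES, rate)
        print(f"{label}: ${total:,.0f}")
```

At the low SGML rate the sketch reproduces the $4,750,000,000 figure cited above. A flat per-page rate ignores volume discounts, migration, and storage, which is why the life cycle cost study recommended in this report is still needed.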

To complete this study, the Department needs to:

•  Determine  the  life  cycle  costs  of  using  each  potential  standard  and  plan  methods 
for  the  inclusion  of  this  figure  in  the  DoD  budget. 

•  Survey  organizations  to  determine  volume  and  formats  of  images 

•  Assemble  a  team  of  experts  to  include  new  profiles,  standards,  conformance 
testing  and  certification  efforts,  test  suite  generation  and  promulgation  efforts,  and  joint 
industry  and  government  initiatives. 

•  Complete  the  evaluation  of  existing  standards  using  the  workshop  criteria 

•  Update  and  revise  affected  DoD  publications  as  required 

•  As  technology  advances,  review  the  imagery  policies  and  standards  and  update 
on  a  periodic  basis 

•  Evaluate  the  need  for  a  DoD  digital  library  allowing  access  to  and  less  costly 
preservation  of  data  prior  to  archiving  with  NARA 

Taking  this  project  to  its  next  logical  steps  will  provide  DoD  with  a  way  to  follow  the 
instructions  of  archiving  laws  while  offering  greater  access  to  digital  information. 




ACRONYMS 


A-V  -  Audio-Visual 

ANSI  -  American  National  Standards  Institute 

ASCII  -  American  Standard  Code  for  Information  Interchange 

C4I  -  Command,  Control,  Communications,  Computers  and  Intelligence 

CAD  -  Computer  Aided  Design 

CALL  -  Center  for  Army  Lessons  Learned 

CALS  -  Continuous  Acquisition  and  Lifecycle  Support 

CAT  -  Computerized  Axial  Tomography 

CD-ROM - Compact Disc - Read Only Memory

CGM - Computer Graphics Metafile

COTS  -  Commercial-Off-The-Shelf 

DAPS  -  Defense  Automated  Printing  Service 

DIA  -  Defense  Intelligence  Agency 

DISA - Defense Information Systems Agency

DoD  -  Department  of  Defense 

DTD  -  Document  Type  Definition 

DWG - An AutoCAD two-dimensional Drawing file format

DXF - Drawing Exchange Format

EBCDIC  -  Extended  Binary  Coded  Decimal  Interchange  Code 

EKG  -  Electrocardiogram 

FRC  -  Federal  Records  Center 

GIF  -  Graphics  Interchange  Format 

GIS  -  Geographical  Information  System 

HPGL  -  Hewlett  Packard  Graphics  Language 

HTML  -  Hypertext  Mark-Up  Language 

IAB - Internet Architecture Board

IEEE - Institute of Electrical and Electronics Engineers

IETF - Internet Engineering Task Force

IGES - Initial Graphics Exchange Specification

IPO  -  IGES/PDES  Organization 

ISO  -  International  Organization  for  Standardization 

ISOC  -  Internet  Society 

ITU  -  International  Telecommunication  Union 

J-STARS  -  Joint  Surveillance  Target  Attack  Radar  System 

JBIG - Joint Bi-level Image Experts Group

JFIF  -  JPEG  File  Interchange  Format 

JPEG  -  Joint  Photographic  Experts  Group 

JTA  -  Joint  Technical  Architecture 

ME  10  -  Two-dimensional  CAD  product  from  Hewlett-Packard 

MPEG - Moving Picture Experts Group

MRI - Magnetic Resonance Imaging

NARA  -  National  Archives  and  Records  Administration 




NASA - National Aeronautics and Space Administration

NIMA - National Imagery and Mapping Agency

OMB - Office of Management and Budget

OPIO - Operational Process Improvement Office

PDE - Product Data Exchange

PDES - Product Data Exchange Specification

PDF - Portable Document Format

PKZIP  -  Commercial  compression/decompression  product 

SGML  -  Standard  Generalised  Mark-up  Language 

SPIFF  -  Still  Picture  Interchange  File  Format 

STEP  -  Standard  for  the  Exchange  of  Product  Model  Data 

TAFIM - Technical Architecture Framework for Information Management

TIFF  -  Tagged  Image  File  Format 

US  PRO  -  United  States  Product  Data  Association 

VHS  -  Standard  format  for  video  cassette  recorders 




List  of  Workshop  Participants 


Two facilitated, groupware-supported workshops, held on October 30-31, 1996 and March
4-6, 1997, aided in gathering information for this research.

Participants  at  one  or  both  workshops  were: 


Dr.  Bruce  Ambacher 

National  Archives 

bruce.ambacher@arch2.nara.gov

Mr.  Russell  Anderson 

National  Imagery  and  Mapping  Agency 
Records  Manager,  MSAW,  N-42 
4600  Sangamore  Road 
Bethesda,  MD  20816 

Mr.  Edward  Arnold 

HQ  Department  of  the  Army 
(ODISC4) 

107  Army  Pentagon 
Washington, DC 20310-0107
arnolew@hqda.army.mil

Ms.  Ann  Barnes 

Defense  Automated  Printing  Service 
ann_barnes@ddas.mil

Dr.  Doris  Bernardini 

Defense  Info  Systems  Agency 
bernardd@ncr.disa.mil

Dr.  Howard  Besser 

School  of  Information  Management  &  Systems 
University  of  California  Berkeley 
Berkeley,  CA  94720-460 
howard@sims.berkeley.edu 

Ms.  Randy  Bixby 

Defense  Technical  Information  Center 
bixby@dtic.mil 




Ms.  Edna  Campbell 

OSD  C3  ITD 

125  S.  Jeff  Davis  Hwy.,  Suite  910 
Arlington,  VA  22202 

Mr.  George  S.  Carson 

GSC  Associates  Inc. 
carson@siggraph.org

Mr.  Robert  Chadduck 

The  National  Archives 
robert.chadduck@arch2.nara.gov

Mr.  William  T.  Crocca 

Xerox  Corporation 
william_crocca@xn.xerox.com

Mr.  Allen  Easterly 

Defense  Logistics  Agency 
Administrative  Support  Center 
8725  John  J.  Kingman  Dr 
Ft.  Belvoir,  VA  22060 
allen_easterly@hq.dla.mil 

Ms.  Joanne  Flanagan 
ANSI/ISO Standards Program
AIIM  International 
1100  Wayne  Ave.,  Ste.  1100 
Silver  Spring,  MD  20910 
jflanagan@aiim.org 

LT  Helena  Gilbert 

U.S.  Navy 

Ms. Karen L. Hampton

U.S. Army Publications & Printing Command
Publications & Records Management Center
Records Management Division
6000 6th Street, Stop C55
Ft. Belvoir, VA 22060
hamptonk@rmpo.belvoir.army.mil




Ms.  Ruby  Harney 

OASD(C3I)
Crystal Gateway 2, Suite 910
Arlington, VA 22202
ruby.harney@osd.mil

Dr. James E. Lightfoot

Logicon 

2100  Washington  Blvd 
Arlington,  VA  22204-5710 
lightfoj@ncr.disa.mil 

Mr.  Paul  Lissy 

Xerox  Corporation 
paul_lissy@co.xerox.com 

Ms.  Fredricka  Molock 

Defense  Automated  Printing  Service 

fredricka_molock@ddas.mil 

Mr.  D.  Burton  Newlin 

OSD(C3I) C3/IT

burt.newlin@osd.pentagon.mil 

Dr.  Ed  Otto 

Logicon Communications Technology
eotto@logicon.com 

Mr.  Alan  Peltzman 

DISA,  Center  for  Standards 
peltzmaa@ncr.disa.mil 

Mr.  John  Pratt 

Logicon  Communications  Technology 
jpratt@logicon.com 

Mr.  Jay  N.  Rivest 

HQ  USAF/SC 
rivestj@pentagon.af.mil

Ms.  Nasra  A.  Sakran 

IBM  Corporation-Government  Industry 

nsakran@vnet.ibm.com 




Mr.  K.  Pete  Suthard 

Defense  Technical  Information  Center 
psuthard@dtic.mil 


Facilitators: 

Mr.  William  Lewis 

DISA  D621 
5600  Columbia  Pike 
Falls  Church,  VA  22041 
lewisw@ncr.disa.mil 

Mr.  Keith  McConnelly 
OPIO 

mcconnek@ncr.disa.mil

Ms.  Alison  Zuna 
OPIO 

Zunaa@ncr.disa.mil 




ENDNOTES 


1. “Long-Term Management Issues in the Preservation of Electronic Information,” paper presented by
Maggie Exon, School of Information and Library Studies, Curtin University of Technology, at the 2nd
National Preservation Office Conference: Multimedia Preservation - Capturing the Rainbow, in Brisbane,
Australia, 28-30 November 1995.

2. Internet document “What is the National Archives,” a web page maintained by the National Archives.
Access point as of 2/26/97 was http://www.nara.gov/nara/whatis/records.html.

3. Internet document, NARA source titled “Introduction: A Rich Information Resource.” Access point as of
3/17/97 was http://gopher.nara.gov:70/0/inform/guide/intronag.txt.

4. Derived from compilation of accessions provided by Ms. Sharon Thibodeau, National Archives and
Records Administration, Washington, DC, 1996.

5. Internet document “Ready Access to Essential Evidence: The Strategic Plan of the National Archives and
Records Administration 1997-2007.” Access point as of 8/21/96 was
http://www.nara.gov/nara/vision/naraplan.html.

6. “Preserving Scientific Data On Our Physical Universe: A New Strategy for Archiving the Nation’s
Scientific Information Resources,” compiled by the Steering Committee for the Study on the Long-Term
Retention of Selected Scientific and Technical Records of the Federal Government; Commission on
Physical Sciences, Mathematics, and Applications; National Research Council. Published by the National
Academy Press, Washington, DC, 1995.

7. Conversation with Jane Bossert from Library of Congress Digital Library, First DoD Archival Workshop,
October 31, 1996.

8. E-mail from Defense Automated Printing Service, subject Document Conversion Pricing, dated April 10,
1997, and information provided by Michelle Spiro on April 29, 1997.

9. From “Document Services Offered by Defense Automated Printing Service,” an undated Defense
Automated Printing Service slide presentation provided by Michelle Spiro.

10. Ibid.




